I would normally say there are no line-ending symbols in a PDF, since the characters are placed by XY co-ordinates and can be stored in any order. A piece of text that reads “Here be” (on the right) and “ye Gold” (lower down on the left) could be extracted as He\n re b\n e ye\n Go\n ld\n, with any of those objects in any order; the \n here is just the physical gap between xy values. That is an extreme illustration, but this kind of fragmentation is very common in PDF storage, even where the order has not been randomised.
Perhaps a simple sample may explain that better. Ignore the table in the background, which may have been drawn before or after the textual content, and remember that “words” are just blocks of font entries of differing lengths (there are no separate words in the “ink”, just letters that may include positive or negative white space).
Here is part of that PDF content. Nowhere is there a literal (\n) line ending or a (\t) tab between columns to be seen.
/F10 14.04 Tf (About this Code:) Tj ET Q
0.78 0.78 0.78 rg
42.52 671.06 100.35 -60.66 re B
BT /F2 16 Tf 18.4 TL 0 g 51.02 618.91 Td
(Person) Tj ET
0.78 0.78 0.78 rg 142.87 671.06 202.75 -60.66 re B
BT /F2 16 Tf 18.4 TL 0 g 151.37 618.91 Td
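To make that concrete, here is a rough Python sketch that walks the fragment above and lists each string shown with `Tj` next to the last `Td` offset seen in the same text block. The tokenizer and the name `shown_strings` are my own illustration, not how MuPDF or any real extractor works; it ignores escaped or nested parentheses beyond the basics, hex strings, `TJ` arrays and the transformation matrix, and a `Td` operand is an offset in text space rather than a final page position.

```python
import re

# The content stream fragment quoted above, copied as-is.
CONTENT = """
/F10 14.04 Tf (About this Code:) Tj ET Q
0.78 0.78 0.78 rg
42.52 671.06 100.35 -60.66 re B
BT /F2 16 Tf 18.4 TL 0 g 51.02 618.91 Td
(Person) Tj ET
0.78 0.78 0.78 rg 142.87 671.06 202.75 -60.66 re B
BT /F2 16 Tf 18.4 TL 0 g 151.37 618.91 Td
"""

# Simplified tokens: literal string, /Name, number, or operator keyword.
TOKEN = re.compile(r"\((?:\\.|[^\\)])*\)|/[^\s()]+|[-+]?\d*\.?\d+|[A-Za-z*]+")

def shown_strings(content):
    """Return (last Td offset or None, text) for every Tj in the stream."""
    operands, position, shown = [], None, []
    for tok in TOKEN.findall(content):
        if tok[0] in "(/+-." or tok[0].isdigit():
            operands.append(tok)          # operands come before their operator
            continue
        if tok == "Td":                   # move the text position
            position = (float(operands[-2]), float(operands[-1]))
        elif tok == "Tj":                 # show one string
            shown.append((position, operands[-1][1:-1]))
        elif tok == "ET":                 # end of text block
            position = None
        operands.clear()
    return shown

for position, text in shown_strings(CONTENT):
    print(position, repr(text))
```

All that comes out is a string and, where the excerpt includes it, a pair of numbers. The first string has no offset only because its `Td` was cut off before the excerpt starts, and the last `Td` has no string because the excerpt ends there.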
This is the MuPDF extraction, where a line ending is injected wherever a text block finishes. The common complaint is that tabs have not been injected for the gaps between “words” in the “table”, but it is just text and lines without any association. Should tabs be inserted in place of the bigger word gaps? Again, there are no words.
About this Code:
Each PDF decompiler can produce a different result from the above, and some may replace white space with tabs or keep text blocks on one line. But PDF was designed as a one-way process, and thus discards \new line, \tab and \paragraph markers. Likewise there is no font style (\b, \i), nor anything like \u for underline or strike-out; those are just rectangles in page media space!
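As a rough illustration of why results differ, here is a minimal Python sketch of the kind of guesswork an extractor has to do. Everything in it is assumed for the example: the `Fragment` record, the co-ordinates and widths, the "A1"/"B1" cells, and the two thresholds `line_tol` and `tab_gap`. It is not how MuPDF or any particular tool decides; it only shows that \n and \t are invented from geometry.

```python
from dataclasses import dataclass

@dataclass
class Fragment:
    x: float      # left edge of the text run (hypothetical units)
    y: float      # baseline; PDF y grows upwards
    width: float  # advance width of the run
    text: str

def rebuild_text(fragments, line_tol=2.0, tab_gap=20.0):
    """Guess line endings and tabs purely from positions."""
    lines = []
    # Top of the page first: a larger y sits higher up.
    for frag in sorted(fragments, key=lambda f: -f.y):
        if lines and abs(lines[-1][0].y - frag.y) <= line_tol:
            lines[-1].append(frag)        # same baseline, same line
        else:
            lines.append([frag])          # baseline changed: a "\n" is born
    out = []
    for line in lines:
        line.sort(key=lambda f: f.x)      # left to right within the line
        text = line[0].text
        for prev, cur in zip(line, line[1:]):
            gap = cur.x - (prev.x + prev.width)
            # A wide gap is treated as a column break, otherwise runs are joined.
            text += ("\t" if gap > tab_gap else "") + cur.text
        out.append(text)
    return "\n".join(out)

# Stored in a deliberately shuffled order, as a PDF is free to do.
FRAGMENTS = [
    Fragment(64, 650, 36, "Gold"),
    Fragment(340, 700, 16, "be"),
    Fragment(150, 600, 20, "B1"),
    Fragment(300, 700, 40, "Here "),
    Fragment(50, 600, 20, "A1"),
    Fragment(40, 650, 24, "ye "),
]

print(repr(rebuild_text(FRAGMENTS)))
```

Change `tab_gap` or `line_tol` and the same page yields different text, which is why two tools rarely agree on where the tabs and line breaks belong.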