# Copy-and-pasting ` (backtick)

ilyaz

Cutting and pasting from the file http://ilyaz.org/software/tmp/try-backtick.pdf in Adobe Acrobat creates

   a=`b


(with U+0060 — as intended by the creator). However, SumatraPDF creates

  a=‘b


with U+2018.

This makes it impossible to reliably include copy-and-pastable programming code in PDF documents.
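For anyone who wants to check what a viewer actually put on the clipboard, here is a minimal Python sketch that lists the code points of a pasted string. The helper name `codepoints` and the two sample strings are only illustrative; paste your own text in their place.

```python
import unicodedata


def codepoints(text):
    """Return (character, 'U+XXXX', Unicode name) for each character of text."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# Sample strings standing in for what each viewer pastes (see the report above).
acrobat = "a=\u0060b"   # backtick, as intended
sumatra = "a=\u2018b"   # left single quotation mark

for label, pasted in (("Adobe Acrobat", acrobat), ("SumatraPDF", sumatra)):
    print(label)
    for ch, code, name in codepoints(pasted):
        print(f"  {code}  {name}")
```

The two strings look almost the same on screen, but the third character differs, which is exactly what breaks pasted code.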

GitHubRulesOK

Edge cut & paste says Try a=‘b

The problem seems to stem from the use of a text descriptor in a not-so-well-defined font with double filters, so it may well have been inserted as
CharSet (/a/b/equal/quoteleft), but that’s not found in the as-used location.
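To illustrate why a viewer that goes purely by glyph names ends up with U+2018, here is a rough Python sketch. The small table holds only the entries needed here, with values as listed in the Adobe Glyph List, and `charset_to_text` is a made-up helper name. It is not how SumatraPDF or MuPDF is actually implemented, just the general idea of name-based extraction.

```python
# Tiny excerpt of glyph-name -> Unicode mappings (Adobe Glyph List values).
# "grave" is included only for comparison; it is NOT in the CharSet above.
GLYPH_TO_UNICODE = {
    "a": 0x0061,
    "b": 0x0062,
    "equal": 0x003D,
    "quoteleft": 0x2018,
    "grave": 0x0060,
}


def charset_to_text(charset):
    """Map a '/name/name/...' CharSet string to the characters it implies."""
    names = [name for name in charset.split("/") if name]
    return "".join(chr(GLYPH_TO_UNICODE[name]) for name in names)


extracted = charset_to_text("/a/b/equal/quoteleft")
print([f"U+{ord(ch):04X}" for ch in extracted])
```

A glyph named quoteleft maps to U+2018, so a backtick (U+0060) would only come out if the glyph were named grave or the font carried an explicit mapping saying so.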

ilyaz

Let me clarify: this is just plain pdflatex running the plain code:

\documentclass{article}
\begin{document}
Try \verb|a=`b|.
\end{document}


Yes, the encoding of TeX’s tt font is a complete mess. One of many things Knuth botched…

Still, this is probably the most frequently used tool for generating PDF documentation of programs. It seems that the Adobe Acrobat team considered this important enough to add some workarounds…

ilyaz

BTW, I tried to insert a backquote using the usual LaTeX ways to do this, as in

\documentclass{article}
\usepackage[utf8x]{inputenc}

\begin{document}
Try \verb|a=`b|.

With backquote: Try \verb|a=‘b|.

Same with \verb|\texttt|: \texttt{a=‘b}
\end{document}


all produce results identical to backtick. (At least “visually identical”, and “identical for copy-and-paste”.)

GitHubRulesOK

I am not trying to prove anything other than that the generated PDF has the symbols as shown, i.e. A B = ’, which is a /quoteleft as defined in the PDF font mapping. So Edge (Try a=‘b), MuPDF and SumatraPDF (Try a=‘b. and Try a=‘b.), PDF-XChange (shown above) and other Windows apps such as STDU PDF viewer (Try a=‘b.) will generally use single quotes for plain text.

‘ Left Single Quotation Mark &#8216; &#x2018;

However on a windows key/clipboard cut and paste its just 'single'. The font would need to be better defined in plain text to show it as here ` a backtick.

In defense of Knuth, AFAIK he developed his own raster DVI fonts (a different problem). It was Adobe, I think, that decided on PS Type 3.

ilyaz

I must say that I do not understand your

on a windows key/clipboard cut and paste its just 'single'

Windows has two text subsystems: codepage and Unicode. What I described above is “pasting with Unicode subsystem”; it produces U+2018. It seems you tested with the “codepage subsystem”. (However, I have no clue what single means here.)
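To make the difference concrete, here is a small Python sketch of what happens to the two characters when text is forced into a codepage. The helper name `to_codepage` and the choice of cp1252, cp437 and ascii are my assumptions for illustration; which codepage a given Windows application uses depends on the system locale. Python's codecs are also stricter than Windows' own conversion, which may substitute a "best fit" character instead of "?".

```python
def to_codepage(text, codepage="cp1252"):
    """Encode text for a codepage, using '?' for characters it cannot represent."""
    return text.encode(codepage, errors="replace")


with_left_quote = "a=\u2018b"   # what the Unicode paste produced
with_backtick = "a=\u0060b"     # what was intended

for codepage in ("cp1252", "cp437", "ascii"):
    print(
        codepage,
        to_codepage(with_left_quote, codepage),
        to_codepage(with_backtick, codepage),
    )
```

The backtick survives every codepage unchanged, while U+2018 either becomes a codepage-specific byte or is lost, so a codepage-based paste can easily look different from a Unicode one.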

Anyway, low-level PDF is a bit above my head. Later today I will try to invite the pdftex developers to contribute. (But anyway, whatever they may change in PDF generation would not modify the tens of thousands of existing PDF documents with this problem.)
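For the existing documents, the only thing a reader can do is repair the text after pasting. A minimal Python sketch, assuming the pasted text is program code set in a TeX typewriter font, so that every U+2018 was meant as a backtick and every U+2019 as a plain apostrophe. That assumption is mine, the name `repair_pasted_code` is hypothetical, and the replacement is lossy: it would also clobber curly quotes that were really intended.

```python
# Assumed mapping for code pasted from TeX tt fonts (see the caveat above):
#   U+2018 LEFT SINGLE QUOTATION MARK  -> U+0060 GRAVE ACCENT (backtick)
#   U+2019 RIGHT SINGLE QUOTATION MARK -> U+0027 APOSTROPHE
PASTE_FIXES = str.maketrans({"\u2018": "\u0060", "\u2019": "\u0027"})


def repair_pasted_code(text):
    """Replace curly single quotes in pasted code with their ASCII counterparts."""
    return text.translate(PASTE_FIXES)


print(repair_pasted_code("a=\u2018b"))
```

This is a workaround on the reader's side only; it does nothing about what the viewer extracts.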

GitHubRulesOK

PDF sample Extract from
SumatraPDF = Try a=‘b.

U+2018=‘ (left quote /quoteleft)
U+2019=’ (right quote /quoteright)

So Sumatra PDF extracts it as /quoteleft which is how it was stored.

ilyaz

I have no clue what you mean by “conflicts with rendering”. A glyph depends on the font — and I was not discussing glyphs. “Search” in a Unicode text is a special can of worms. Do you REALLY want me to go into discussing search?! (So search “matching something” is not a proof of anything…)

However, as I said, I want to keep as far as possible from the internal structure of PDF, so I cannot discuss the details of how this glyph is stored in PDF. I trust you that there is some problem with this. But for this, I’m essentially Cc’ing [email protected].