Copy protection


#1

Hi,

I noticed that in some documents the text is copied as an image.
It is not convenient, could you please fix it?

For example, the PDF viewer Evince does not have this problem. I really like your program, and I would like SumatraPDF not to have it either.


#2

In some PDFs what looks like text is actually an image. Sumatra doesn’t do any text -> image conversion on extraction.

To do image -> text we would have to do OCR but Sumatra can’t do that (because it’s hard).


#3

It is likely 11Mihaylov is referring to “protected” text (from other user comments re no-copy DRM: “Evince supports copy protection, but it can be turned off, some distros may turn it off by default”).

In such cases standard SumatraPDF indicates that it will only copy the text as an image (respecting the author’s “wishes”).


#4

See also “Why I have switched away from SumatraPDF : software”


#5

All the instances where I find copy-protected PDF files are, oddly enough, electronic data sheets. The entire purpose is to provide data so you can design in their parts, but they don’t want you to copy any of that data directly… into your design documents, for example. Crazy. So that is why I want to be able to bypass such copy protection.


#6

Why not add a program that strips PDF copy protection to your workflow? Either that, or modify Sumatra’s source and compile it so it ignores the PDF restriction flags.


#7

For one, I don’t know what a “workflow” is. Any suggestions as to where to find such programs? Life is just so much easier when I can use one program to look at PDF files without having to dig around and find other stuff.


#8

“workflow” is simply the bunch of steps you carry out to get your work done. All I meant was that you can incorporate an additional step that simply strips your PDFs of any restrictions and also user/owner password(s) if required. A simple web search for “remove pdf restrictions” or “remove pdf password” or similar will result in any number of free and paid solutions that you can try to see what works best for you.


#10

Yes, extra steps. Most involve downloading other programs or using web sites to remove the restriction. This disrupts my “workflow” and makes my life more difficult.

I was reading somewhere that Sumatra originally had this capability and the author took it out because, “I decided that [Sumatra] will honor PDF creator’s wishes”. So I guess this is not a feature we will see in Sumatra anytime soon. Shame. It is silly for people to prevent copying of text from a data sheet. It’s a DATAsheet and I’m looking for data! I read one thread here where a utility protects their bills from being printed!!! WTF???


#11

What can one say about the idiocy of some organizations/companies? :roll_eyes: Perhaps if enough people petition them or complain then they can be persuaded to change their practices, but I wouldn’t bet on it.

Not in the official release at least, I suppose. However, the beauty of open source is that anyone who knows how can make the required changes and compile their own copy which will ignore all PDF restrictions. If you aren’t a coder but are willing to trust a modified version compiled by a long-time user, I’d like to point you to @ianas’ builds available on his Mega account. Be sure to go through the ReadMe text file first to see a complete list of changes that have been made.

Disclaimer: Please note that no support will be provided for these unofficial versions and you use them at your own risk.


#13

the DRM support is a feature that can be turned off
zeniko’s old builds ignored drm
just get the source and compile with the /D DISABLE_DOCUMENT_RESTRICTIONS flag
bypassing DRM is also illegal in the US and maybe this is why it’s on by default
I also remember reading somewhere that Krzysztof wanted to respect the pdf’s author’s choices and that’s why SumatraPDF does not ignore owner passwords
ps pdf drm can be stripped with a ghostscript command or a ton of free/paid apps or web sites
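
For anyone wondering what such a switch amounts to, here is a minimal sketch. It is not SumatraPDF's actual code: Sumatra is written in C++ and the flag is a compile-time define, so the Python constant below only stands in for it. The function names and the image placeholder are made up for illustration, and the copy bit value is my reading of the PDF permission flags.

```python
# Hypothetical sketch: this constant stands in for a C++ compile-time define
# such as DISABLE_DOCUMENT_RESTRICTIONS. It is not SumatraPDF's real code.
DISABLE_DOCUMENT_RESTRICTIONS = False

# Assumption: bit 5 of the PDF permission flags (value 16) allows copying
# text and graphics out of the document.
PERM_COPY = 1 << 4


def copy_as_text_allowed(permission_flags,
                         disable_restrictions=DISABLE_DOCUMENT_RESTRICTIONS):
    """Return True if the viewer should copy the selection as text."""
    if disable_restrictions:
        # A build with the switch turned on ignores the document's flags.
        return True
    return bool(permission_flags & PERM_COPY)


def copy_selection(text, permission_flags,
                   disable_restrictions=DISABLE_DOCUMENT_RESTRICTIONS):
    """Return a (kind, payload) pair describing what goes on the clipboard."""
    if copy_as_text_allowed(permission_flags, disable_restrictions):
        return ("text", text)
    # Otherwise fall back to an image of the selection (placeholder here).
    return ("image", "<bitmap of selection>")
```

With the switch off, a document whose flags lack the copy bit yields the image fallback; with it on, the same document yields plain text.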


#14

Just for others’ information: the ReadMe file you refer to doesn’t seem to have much info, actually. It lists some changes as #1, #2, #3 and #4 without indicating what revision of the tool they are in. So I guess you can grab the most recent version and assume it has all four updates.


#15

The bizarre part of not allowing copying is that it doesn’t prevent copying of the document or text. It just prevents the copy and paste of text as text. Then as you say it can be easily bypassed. It’s just a PITA to have to go to a web site or pull up another tool just to facilitate a simple copy and paste!


#16

@gnuarm You are perfectly right that having to “work around” DRM is a PITA and yet it is in Adobe’s interest to provide a mechanism for the fools that think it could be of value.

I agree it is worthless to ban the masses from reading and even more foolish to think that copyrighting psalters or other illustrated work is in the interest of human education (although I feel it might have reduced the frequency of wars). However, I do understand and accept that SumatraPDF has to sit on the right side of legislation, even if the law is a donkey.

Anyway, enough rambling and ranting on my part. I never intentionally block readers of my works, but that’s because I appreciate that although it is frustrating to see others copy one’s work, the less it is crippled the further it walks.


#17

From what I understand the ReadMe applies equally to all the modified versions. In any case if one is going to trust and try the user-modified version then how long would it really take to check whether it does have all 4 updates or not (or whichever ones are of interest)?

Still, maybe @ianas will read your post and update the ReadMe to address your concerns…

Are you sure it’s mandated by legislation? When tools to remove restrictions are freely available and even sold, who says a PDF reader (especially a free one) can’t just ignore the restrictions if the dev feels like it?


#18

The legal action would not be on those who provide the software since there would be legitimate uses for it. The legal action would be against those who use it for illegal purposes.

I used to use bittorrent and got a warning from my ISP. Then later I was booted by the ISP for having a copyrighted document from bittorrent… a Xilinx application note! It may have been copyrighted, but they wrote it to share so I wasn’t doing anything wrong. lol They saved me a bunch of money by not having my own internet access. It was pretty bad service too.


#19

Peter, I am not trying to contest the “rights to copy”; however, the jurisdiction for SumatraPDF is USA-centric and I note:

“In May 1998, the Digital Millennium Copyright Act (DMCA) passed as an amendment to US copyright law, which criminalizes the production and dissemination of technology that lets users circumvent technical copy-restriction methods”

Thus, if I were based within that jurisdiction, I would err on the side of caution!


#20

I would love to know whether it’s a settled matter that a FOSS reader app that ignores simple PDF restrictions would truly be considered illegal in the US. However, that’s probably a question best suited to a forum about legal issues.

In any case they’re useless restrictions that can be dealt with by anyone determined to do so, and the fact that Sumatra’s code even includes a DISABLE_DOCUMENT_RESTRICTIONS flag just waiting to be turned on makes the situation even more ridiculous.

Well, there’s no point debating about it. kjk’s decision is what it is and the official release isn’t likely to change in this regard, so all such requests are basically pointless. Folks can either go compile their own copies or use someone else’s pre-compiled version with the flag turned on. End of story.

Edit: I’ll just leave this interesting titbit here that’s straight from the horse’s mouth (i.e. the official PDF Reference from Adobe):

It is up to the implementors of PDF consumer applications to respect the intent of the document creator by restricting user access to an encrypted PDF file according to the permissions contained in the file.
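
To make “the permissions contained in the file” a bit more concrete: they are stored as a single integer of bit flags in the encryption dictionary, and a reader decides what to do with each bit. Below is a rough sketch of decoding that integer. The bit positions are as I recall them from the PDF Reference for the standard security handler, so treat the table as an assumption and check it against the spec. The names `PERMISSION_BITS` and `decode_permissions` are mine, not from any library.

```python
# Bit positions (1-based, lowest-order bit first) as I recall them from the
# PDF Reference; an assumption to verify against the spec before relying on it.
PERMISSION_BITS = {
    "print": 3,
    "modify": 4,
    "copy": 5,
    "annotate": 6,
    "fill_forms": 9,
    "extract_for_accessibility": 10,
    "assemble": 11,
    "print_high_quality": 12,
}


def decode_permissions(p):
    """Map a signed 32-bit permissions integer to a dict of booleans."""
    # The value is usually written as a negative number, so view it as an
    # unsigned 32-bit pattern before testing individual bits.
    flags = p & 0xFFFFFFFF
    return {name: bool(flags & (1 << (bit - 1)))
            for name, bit in PERMISSION_BITS.items()}


# Example: everything allowed except copying text and graphics.
perms = decode_permissions(-20)
print(perms["print"], perms["copy"])
```

Nothing in the integer enforces anything by itself; a reader that never looks at the `copy` entry simply copies text as usual, which is the point of the quote above.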