How to convert all pdf files to text (within a folder) with one command?
I know that I can convert PDF files to text files one by one like this:
$ pdftotext filename.pdf
But is there a single command that converts all of them, without specifying each file name separately?
I see here, on Wikipedia, that "Wildcards (*), for example $ pdftotext *pdf, for converting multiple files, cannot be used because pdftotext expects only one file name."
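A quick way to see what the quoted sentence means, sketched with a stand-in function rather than pdftotext itself (count_args and the scratch files are made up for illustration): the shell expands the pattern into one argument per matching file before the command starts, whereas pdftotext's synopsis takes one PDF file and, optionally, one output text file.

```shell
# count_args is a hypothetical stand-in for pdftotext: it only reports
# the arguments it was given.
count_args() {
    printf '%d argument(s):' "$#"
    printf ' [%s]' "$@"
    printf '\n'
}

# Scratch directory with three empty stand-in files (not real PDFs).
demo_dir=$(mktemp -d)
touch "$demo_dir/a.pdf" "$demo_dir/b.pdf" "$demo_dir/c.pdf"

# The shell expands *.pdf before count_args starts, so it receives three names.
glob_result=$(cd "$demo_dir" && count_args *.pdf)
printf '%s\n' "$glob_result"   # prints: 3 argument(s): [a.pdf] [b.pdf] [c.pdf]
```

A loop or xargs -n1 avoids this by handing pdftotext one name at a time.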
pdf text convert batch
edited Nov 4 '12 at 18:41
asked Nov 4 '12 at 18:01
cipricus
5 Answers
The following will convert all PDF files in the current directory:
for file in *.pdf; do pdftotext "$file" "$file.txt"; done
edited Dec 26 at 3:20
Pablo Bianchi
answered Nov 4 '12 at 18:16
Sam
It's only one command, and it can be typed on one line in the terminal (it's pdftotext inside a for loop in one-line syntax, which is what the OP asked for).
– Sam
Nov 4 '12 at 18:23
Check out these links for more info on how the for loop works: cyberciti.biz/faq/bash-for-loop thegeekstuff.com/2011/07/bash-for-loop-examples
– Sam
Nov 4 '12 at 18:28
Would this not cause issues with non-PDF files?
– cprofitt
Nov 4 '12 at 18:30
Wouldn't this produce files like "filename.pdf.txt"?
– Ryan Thompson
Nov 4 '12 at 23:27
Yes. If that's a problem, we could remove the .pdf extension using sed or awk, but that would add complexity to the command.
– Sam
Nov 5 '12 at 2:00
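For the extension issue raised in these comments, sed or awk may not be needed: the shell's own ${file%.pdf} expansion strips a trailing .pdf. A minimal sketch, assuming pdftotext is installed; txt_name and convert_all are hypothetical names used only here.

```shell
# txt_name is a hypothetical helper: "report.pdf" becomes "report.txt".
txt_name() {
    # ${1%.pdf} removes a trailing ".pdf" from the first argument, if present.
    printf '%s\n' "${1%.pdf}.txt"
}

# convert_all is hypothetical too; it converts every *.pdf in the current directory.
convert_all() {
    for file in *.pdf; do
        # When nothing matches, the loop sees the literal "*.pdf"; skip it.
        [ -e "$file" ] || continue
        pdftotext "$file" "$(txt_name "$file")"
    done
}
```

Running convert_all in the folder in question should then produce sample.txt rather than sample.pdf.txt. pdftotext also derives the .txt name by itself when the second argument is left out.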
ls *.pdf | xargs -n1 pdftotext
xargs is often a quick solution for running the same command multiple times with just a small change each time. The -n1 option makes sure that only one PDF file is passed to pdftotext at a time.
Edit: If you're worried about spaces in filenames and such, you can use this alternative:
find . -name '*.pdf' -print0 | xargs -0 -n1 pdftotext
edited Nov 5 '12 at 0:59
answered Nov 4 '12 at 23:24
Ryan Thompson
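To see why the -print0 / -0 pair matters, here is a small sketch that uses printf as a harmless stand-in for pdftotext; the file name and the two function names are made up for illustration. By default xargs splits its input on blanks as well as newlines, so a name with a space is handed over in pieces.

```shell
# Hypothetical demo: one file name containing a space, passed as plain text.
# xargs splits on the blank, so the stand-in command runs twice.
split_on_blanks() {
    printf '%s\n' 'my report.pdf' | xargs -n1 printf '<%s>\n'
}

# Same name, NUL-terminated (what find -print0 emits) and read with xargs -0.
# The name now arrives as a single argument.
nul_delimited() {
    printf '%s\0' 'my report.pdf' | xargs -0 -n1 printf '<%s>\n'
}

split_on_blanks   # prints <my> and <report.pdf> on separate lines
nul_delimited     # prints <my report.pdf>
```

With pdftotext in place of printf, the first form would look for two files that do not exist, while the second would open the intended one.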
Please see my answer: can that command be adapted so as to avoid the problem mentioned there? This doesn't mean that your solution is not good; on the contrary, it does something very specific that the other alternatives here do not. I was just curious.
– cipricus
Nov 18 '12 at 23:22
Alternatively: ls *.pdf | xargs -L1 -I% pdftotext % %.txt
– kenorb
Aug 1 '14 at 9:18
Write a bash script:
for f in *.pdf; do
    pdftotext "$f"
done
Or type it as a one-line command:
for f in *.pdf; do pdftotext "$f"; done
I hope this helps. I do not have a large group of .pdfs to test this on, but I use this strategy to convert my .flac files to .ogg files.
edited Nov 5 '12 at 0:59
user76204
answered Nov 4 '12 at 18:29
cprofitt
Can it be done by opening the terminal in that folder and running a command, instead of inserting the path manually?
– cipricus
Nov 4 '12 at 18:39
Couldn't you paste it here as such and insert it into your answer? That would be a good answer. I was not able to arrive at a working command just by deleting part of what you posted.
– cipricus
Nov 4 '12 at 19:08
The find and xargs commands I initially suggested did not work when I got a chance to test them.
– cprofitt
Nov 4 '12 at 20:01
First I have to thank Sam and Ryan Thompson, as well as all the other answerers; my answer is only a variation on theirs, about adding their solutions to Thunar's custom actions.
Like any terminal command, a command that converts all PDF files in a folder to text can be added to the list of custom actions in the Thunar file manager.
The command I prefer to use there is find . -name '*.pdf' -print0 | xargs -0 -n1 pdftotext (from Ryan Thompson), but it has a nasty catch.
It must be used with care: find also searches subdirectories, so the command converts every PDF in the folder where it is run and in every folder below it. If it is run by mistake in the home folder, it will have unwanted effects: a text file will be created for every one of your PDFs!
(I tested it like this: I created a folder called "test" on the desktop containing a PDF file and a series of nested folders (/Desktop/test/a/b/c/e/f/g/h/i), each containing the same PDF. Running the command in /Desktop/test converted all the PDFs, down to the one in the "i" folder.)
(I would welcome comments on how to adjust this command so as to avoid that risk.)
Replacing it with the other command (for file in *.pdf; do pdftotext "$file" "$file.txt"; done), from Sam, avoids the problem.
But in certain cases one might want exactly what Ryan's solution does!
edited Apr 13 '17 at 12:24
Community♦
answered Nov 18 '12 at 22:33
cipricus
You can avoid the find command searching in subdirectories by using -maxdepth 1. Also, when putting it into Thunar's custom actions feature, you should probably replace find . with find %F to allow Thunar to properly pass the paths of the selected directories.
– Ryan Thompson
Nov 19 '12 at 1:10
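Put together, the -maxdepth 1 suggestion might look like the sketch below. convert_top_level is a hypothetical name; the sketch assumes a find that supports -maxdepth (GNU and BSD find do) and an installed pdftotext, and it uses -exec in place of the xargs pipeline from the answer, purely for brevity.

```shell
# convert_top_level is a hypothetical helper name.
# $1 is the folder to process; it defaults to the current directory.
convert_top_level() {
    # -maxdepth 1 stops find from descending into subdirectories.
    # -exec ... {} \; runs pdftotext once per match, with the name passed
    # as a single argument even if it contains spaces.
    find "${1:-.}" -maxdepth 1 -type f -name '*.pdf' -exec pdftotext {} \;
}
```

Called as convert_top_level some/folder, this should leave PDFs in nested folders untouched.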
for file in *.pdf; do pdftotext "$file" "$file.txt"; done
This one outputs sample.pdf.txt.
I tried this one instead, as user2357111317 suggested, and I also included -layout to preserve the layout of the text:
for file in *.pdf; do pdftotext -layout "$file"; done
edited May 21 at 11:52
Thomas
answered May 21 at 11:13
hinky