I am using Inkscape to convert vector/text PDF files to SVG, for import into Visio.
When I use the Poppler/Cairo import option, the SVG looks identical to the PDF, but all of the text is converted to paths, resulting in a huge file.
When I use the Internal import option, the text stays as text, resulting in a much smaller file, but some of the text has moved and changed attributes (font size).
I suspect this is a PDF issue and not an Inkscape issue, since I get the same results when using online PDF to SVG converters.
Does anyone know why this is happening or how to fix it?
Thanks
Here's what's happening.
[Internal Import] gives you two options. [Substitute missing fonts] converts pdf text into svg text. However, if your system doesn't have the pdf fonts installed, Inkscape substitutes the closest matching font. I don't know how this "match" is selected and the results aren't always pretty. Sometimes the original font information may be missing completely. If you can identify, find and install the missing pdf fonts, your svg should be a near-perfect duplicate. (Sharing svg images also has this problem, if your recipient doesn't have your fonts.)
[Draw missing fonts] converts pdf text to svg paths. This generates a lot of nodes and the file size grows, as you saw.
[Cairo Import] creates a shape for every glyph (letter, digit, punctuation, etc.) in the pdf. Each glyph is stored only once. What you see on the svg page are clones of these stored shapes. This is much more efficient than drawing individual letters, especially for "wordy" pdf documents. I suspect that Visio converts these clones to paths, increasing the file size again.
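To see how much the clones are saving, you can count them against the shapes they point to. This is only a rough sketch: it assumes the glyphs are referenced through `<use>` elements with an `xlink:href` (or plain `href`) attribute, and the `sample` document and the `clone_stats` name are made up for illustration, not taken from a real Inkscape export.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def clone_stats(svg_text: str) -> tuple[int, int]:
    """Return (number of <use> clones, number of distinct shapes they reference)."""
    root = ET.fromstring(svg_text)
    targets = []
    for use in root.iter(SVG_NS + "use"):
        # SVG 1.1 writes xlink:href; SVG 2 also allows a plain href.
        href = use.get(XLINK_HREF) or use.get("href")
        if href:
            targets.append(href.lstrip("#"))
    return len(targets), len(set(targets))


# Hypothetical miniature of a glyph-per-shape export: "R1" and "R2" share the "R" glyph.
sample = """<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:xlink="http://www.w3.org/1999/xlink">
  <defs>
    <symbol id="glyph-R"><path d="M0 0h4v6h-4z"/></symbol>
    <symbol id="glyph-1"><path d="M1 0h1v6h-1z"/></symbol>
    <symbol id="glyph-2"><path d="M0 0h4v1h-4z"/></symbol>
  </defs>
  <use xlink:href="#glyph-R" x="10" y="20"/>
  <use xlink:href="#glyph-1" x="15" y="20"/>
  <use xlink:href="#glyph-R" x="40" y="20"/>
  <use xlink:href="#glyph-2" x="45" y="20"/>
</svg>"""

clones, shapes = clone_stats(sample)
print(f"{clones} clones share {shapes} stored shapes")
```

On a real file you would read the SVG from disk and pass its contents in. A high clone-to-shape ratio suggests that converting the clones to paths would inflate the file considerably.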
How to fix it?
That depends what you want to do with the image. If you want to edit the text, use [Substitute missing fonts] and choose a suitable font in Visio. There are further complications. Pdf text is often broken into seemingly arbitrary blocks with strange kerning, so text flow and layout might be weird. If you want perfect visual fidelity you're probably stuck with large file sizes.
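If you go the [Substitute missing fonts] route, it helps to know which font families the resulting SVG actually asks for, so you can check whether they are installed. A minimal sketch, assuming the family is stored either as a `font-family` attribute or inside the `style` attribute of `<text>` and `<tspan>` elements; the `sample` SVG and the `font_families` helper are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

SVG_NS = "{http://www.w3.org/2000/svg}"


def font_families(svg_text: str) -> Counter:
    """Count the font-family values found on <text> and <tspan> elements."""
    root = ET.fromstring(svg_text)
    counts = Counter()
    for elem in root.iter():
        if elem.tag not in (SVG_NS + "text", SVG_NS + "tspan"):
            continue
        # The family may be a plain attribute or part of the CSS style attribute.
        family = elem.get("font-family")
        if family is None:
            match = re.search(r"font-family\s*:\s*([^;]+)", elem.get("style", ""))
            family = match.group(1) if match else None
        if family:
            counts[family.strip().strip("'\"")] += 1
    return counts


# Hypothetical fragment with a few component designators.
sample = """<svg xmlns="http://www.w3.org/2000/svg">
  <text style="font-size:8px;font-family:Arial"><tspan x="10" y="20">R1</tspan></text>
  <text style="font-size:8px;font-family:'Helvetica Neue'"><tspan x="30" y="20">C1</tspan></text>
  <text font-family="Arial" x="50" y="20">R2</text>
</svg>"""

for family, count in font_families(sample).most_common():
    print(f"{family}: {count}")
```

Any family in the list that isn't installed on your machine is a candidate for substitution, and so a likely source of shifted or resized text.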
@PaddyCAD, thanks for the explanation.
The files are electronic circuit layouts with a lot of small parts, mostly rectangles and circles, to which I want to add some annotations in Visio. Each shape has a designation (R1, R2, ... for resistors, C1, C2, ... for capacitors, etc.), and since there are hundreds of parts there is a lot of text. Automatic font substitutions would be fine, but text moving around is not.
I think I will first try substituting fonts in the PDF with fonts available on my computer, and see what happens. If that doesn't work, I can just move the shifted text back to its correct location in Visio, assuming I can spot all the instances.
I suppose I could also add my annotations directly in Inkscape, but then it would be difficult for others, who aren't familiar with Inkscape, to edit the diagrams at a later time if necessary.
In any event, you have now given me some directions to explore; thanks for that.
I've done something similar, importing PDF architectural drawings into Inkscape and manually correcting some wayward text. As you say, it can be hard to spot the differences.
When this happens, I first use Cairo import for high fidelity. Select everything and use [Edit > Make a Bitmap Copy]. Set the bitmap opacity low, move it to another layer, and lock the layer. Then delete the imported vector objects, keeping only the bitmap.
Now load the pdf again using [Internal Import] [Substitute missing fonts]. Any mismatch with the transparent bitmap should be obvious immediately. You may want to adjust colours to increase the contrast.
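Since the original complaint also mentions changed font sizes, a script can pre-screen the substituted SVG for labels whose size differs from the most common one. This is just a sketch under assumptions: it expects `font-size` to sit on the `<text>` element itself (as an attribute or in `style`), and `odd_sized_labels` and the `sample` are invented for illustration. It will not catch text that has only moved, so the overlay check is still needed.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

SVG_NS = "{http://www.w3.org/2000/svg}"


def font_size(elem):
    """Return the element's font-size string, or None if it declares none."""
    size = elem.get("font-size")
    if size is None:
        match = re.search(r"font-size\s*:\s*([^;]+)", elem.get("style", ""))
        size = match.group(1) if match else None
    return size.strip() if size else None


def odd_sized_labels(svg_text):
    """List (label, size) pairs whose size differs from the most common size."""
    root = ET.fromstring(svg_text)
    labels = []
    for text in root.iter(SVG_NS + "text"):
        size = font_size(text)
        content = "".join(text.itertext()).strip()
        if size and content:
            labels.append((content, size))
    if not labels:
        return []
    usual, _ = Counter(size for _, size in labels).most_common(1)[0]
    return [(content, size) for content, size in labels if size != usual]


# Hypothetical fragment: C2 came through with a different size.
sample = """<svg xmlns="http://www.w3.org/2000/svg">
  <text x="10" y="20" style="font-size:8px;font-family:Arial">R1</text>
  <text x="30" y="20" style="font-size:8px;font-family:Arial">R2</text>
  <text x="50" y="20" style="font-size:8px;font-family:Arial">C1</text>
  <text x="70" y="20" style="font-size:11.5px;font-family:Arial">C2</text>
</svg>"""

for label, size in odd_sized_labels(sample):
    print(f"check {label}: font-size {size}")
```

If a drawing legitimately uses several text sizes, the output will include false alarms, so treat it as a list of places to look rather than a list of errors.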