Posts from March, 2010

PDF to Word conversion

David Stevenson at 07:53 GMT on 25 March 2010

The popularity of PDF as a means for sharing information via email or the web has had an unintended consequence – the orphan PDF. PDF is a publishing, not an authoring format. If you want to change the content of a PDF, you go back to the source document – its parent if you will. But what if you can’t find the parent? Or don’t trust the source?

For this reason many organisations follow strict procedures to keep track of document versions, or implement systems that do this automatically. Such systems often integrate with popular authoring tools such as Microsoft Word and act as a front-end to a document management system (DMS). They systematically store each iteration of the document and keep track of versions, so that whenever information is published to PDF its source is known.

Even with systems to help, the situation often arises that a PDF rendition of a document is considered the “master”, for all sorts of perfectly valid reasons. While PDF tools, including our own, can make minor text corrections (fixing a typo for example) the nature of PDF precludes large scale changes. For that you need to convert the PDF back to an authoring format. It turns out that this is a very common problem, and thousands of Google searches on “convert PDF to Word” are carried out every day.

While tools such as ours provide an “Export to Word” feature that does this job at the click of a mouse, the internal process of converting a PDF to Word is actually much harder than one might expect, and difficult to do well. Open a Word document in Page Layout view and a PDF of the same document side-by-side and you could be forgiven for thinking they are the same thing, but in fact the PDF page has a very different internal structure. PDF and XPS are fixed formats; they represent what an authoring application would print. Indeed, most PDFs are generated by interpreting the data a printer uses to put marks on paper and storing it in a form that retains all the graphics, layout and images of a printed page. By contrast, an authoring format such as Word’s DOC or DOCX is a “flow” format – the information is essentially a flow of text and graphics that are formatted to fit a page according to rules of layout. Change the width of the margins, for example, and the text will reflow to suit.
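The difference between the two kinds of format can be illustrated with a small Python sketch. It is a toy model, not how PDF or Word actually store data: the names `layout` and `to_fixed`, the character-based line width and the coordinate values are all assumptions made for illustration.

```python
import textwrap

# Flow format (toy model): a single stream of text with no line breaks stored.
flow_text = (
    "An authoring format holds a flow of text that is "
    "formatted to fit the page according to rules of layout."
)


def layout(text, chars_per_line):
    """Apply a layout rule: break the text stream into lines of a given width."""
    return textwrap.wrap(text, width=chars_per_line)


def to_fixed(lines, left=72, top=72, leading=14):
    """'Print' the lines: record each one at a fixed (x, y) position.

    The coordinates are hypothetical points measured from the top-left corner.
    """
    return [(left, top + i * leading, line) for i, line in enumerate(lines)]


# Changing the available width reflows the same stream into different lines.
narrow = layout(flow_text, 30)
wide = layout(flow_text, 60)

# The fixed page is a snapshot of one particular layout. It keeps positions
# and text, but no longer knows where paragraphs start or how to reflow.
fixed_page = to_fixed(wide)
```

In this model, changing the width passed to `layout` changes where the lines break, whereas `fixed_page` stays exactly as it was "printed". Recovering the flow from the fixed page is the job a converter has to do.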

In order to convert a fixed format back to a flow format, the program has to apply the same rules we learned as children about reading a page. In the case of Western documents, this means starting at the top left of the page, working towards the bottom right, following text columns as required, then moving on to the next page and so on. It has to follow rules about how sentences and paragraphs are formed, and allow for text that continues after an intervening picture or diagram. It then has to encode this information in a manner that will, as far as possible, reproduce the same layout in the authoring format: page size, margins, text, images, font attributes and much more.
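The reading-order rules described above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the method used by any particular product: it handles a single column only, the `Fragment` type and both function names are hypothetical, and it assumes y coordinates grow downward from the top of the page.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """A run of text at a fixed position on the page (hypothetical model)."""
    x: float   # left edge, in points from the left of the page
    y: float   # baseline, in points from the top of the page (assumption)
    text: str


def group_into_lines(fragments, tolerance=2.0):
    """Group fragments with nearly equal baselines into lines, top to bottom."""
    lines = []
    for frag in sorted(fragments, key=lambda f: (f.y, f.x)):
        if lines and abs(lines[-1][0].y - frag.y) <= tolerance:
            lines[-1].append(frag)
        else:
            lines.append([frag])
    # Within each line, read from left to right.
    return [sorted(line, key=lambda f: f.x) for line in lines]


def lines_to_paragraphs(lines, line_height=12.0):
    """Join lines into paragraphs; a gap over 1.5 line heights starts a new one."""
    paragraphs = []
    prev_y = None
    for line in lines:
        text = " ".join(f.text for f in line)
        y = line[0].y
        if prev_y is not None and (y - prev_y) <= 1.5 * line_height:
            paragraphs[-1] += " " + text
        else:
            paragraphs.append(text)
        prev_y = y
    return paragraphs


# Fragments are deliberately listed out of order, as they may be in a real file.
page = [
    Fragment(150, 100.5, "world"),
    Fragment(72, 100, "Hello"),
    Fragment(72, 112, "again."),
    Fragment(72, 140, "New paragraph."),
]
paragraphs = lines_to_paragraphs(group_into_lines(page))
```

A real converter also has to detect columns, tables, headers and images, and to carry over fonts and margins, which is where most of the difficulty lies.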

Results vary. PDF documents that convert well to Word are those with straightforward layout; more often than not they started life in Word or another word processor. Complex layouts tend to be more of a challenge; with these, there is often a choice (as there is with our tool) to place an emphasis during conversion on maintaining the layout and appearance of the page at the expense of easy text editing, or vice versa.

So it makes sense to check that a tool you are considering for this task is up to the job. Ours is available with a free trial – just go to http://www.globalgraphics.com/gdoc/free-creator and follow the link to download.

The big free software switch

David Stevenson at 12:40 GMT on 17 March 2010

The impending end of the tax year tends to focus the mind on matters financial. For businesses, controlling costs is vital; with this in mind, we put together some ideas and advice on the subject.

Global Graphics urges businesses planning for the new financial year to consider all options for savings in a tough economic climate, including examining software costs.

Companies are in a flurry this month to be ready for the new financial year, and if the government's moves to save £11bn across its departments are anything to go by, it looks like 2010 is going to be a tough year all round.

Global Graphics is urging businesses to wake up to the idea that free doesn’t mean poor quality, and that now is a great time to ‘spring clean’ and discover areas in which to make savings. A recent study commissioned by Global Graphics showed that 76 per cent of large organisations already use or are planning to issue free software across the enterprise in 2010. The study also demonstrated that free software is not only a practical set of desktop products but also something more fundamental to boosting enterprise productivity.

Our CEO, Gary Fry, offers some points to think about:

  • IT budgets are still under pressure – A recent report by the National Computing Centre revealed that 28% of those surveyed admitted they have had to undertake significant cost-cutting in their IT operations and a further 32% have had to make moderate cuts; 33% are putting software refreshes on hold
  • The pros and cons of free software – Despite organisations feeling positive about free opportunities and the prospect of savings across the board, Global Graphics research found that the two main concerns for CIOs when considering free software were product quality and support. Something can be free but it needs to be reliable and offer free product upgrades and the option of free forum-led and/or paid-for support
  • Re-negotiate with current suppliers. Now is the time to shop around for the best utilities, insurance providers and every other business cost including enterprise software.
  • I’ll scratch your back if… Find opportunities for ‘contra’ arrangements with suppliers and partner organisations, either offering like-for-like services or reduced rates
  • Think about features, usability, integration and ROI – Free products vary vastly in quality and service. In the PDF market the competitors fall into two main categories: those from commercial organisations such as Global Graphics, which offer free PDF creation and the option to upgrade to a paid-for product, and freeware, which is generally limited in functionality, often less easy to install and use, and poorly supported. There is a third category: PDF creation that is built in to an existing document creation product, or that can be freely downloaded for it
  • Value for money – Organisations are under increasing pressure to do more with less. Make sure you consider what ‘free’ really means and the broader functionality available. Ensure that the number of software licences enabled matches office functions and staffing levels.

Global Graphics’ gDoc Creator is the only enterprise quality PDF creation and viewing software tool that is available for free. Over 200,000 customers are now creating and viewing their documents for free with Global Graphics software. If you want more functionality, gDoc Fusion is the paid-for big brother to Creator and is still very competitively priced.

Top five PDF security tips

David Stevenson at 14:32 GMT on 9 March 2010

Following Adobe’s admission last December of a security flaw in their free Reader, there was a lot of media attention on the issue of rogue PDFs and their potential as a vector for viruses. The furore has since died down, but the threat remains. Here are five tips to help keep you safe.

1. Keep your PDF software and antivirus software updated by visiting your providers’ websites
2. Don’t open PDFs from people you don’t know, no matter how tempting the title!
3. Keep an eye out for any PDF security advice coming out from the likes of SANS (http://isc.sans.org/)
4. Be wary of PDF software that has had security scares or is targeted by hackers. There are alternatives.
5. If you do use free PDF software from smaller providers, make sure you know they have strong support services