This week we announced a groundbreaking update to Mako Core™, our popular software development kit (SDK) for developers specializing in print document processing and manipulation.
Mako now includes a game-changing intelligent Document Object Model (iDOM) that provides unparalleled access to every aspect of a document’s composition, including fonts, images, vector art, layers and metadata. Unlike other SDKs that offer generic support for various Page Description Languages such as PDF, PS, OXPS, and PCL across print and digital document domains, Mako is sharply focused on print.
Recently I was talking to one of our Mako™ partners, Actino Software, who are based in Germany. Actino have successfully built many document solutions for their customers with Mako, and a recent example was for a large German crane manufacturer with subsidiaries worldwide. The design, manufacture and maintenance of cranes requires a lot of documentation, and the company estimates they manage around 100,000 documents, mainly in PDF.
There are numerous challenges to managing such a large corpus of documents, and one of those is the quality of those documents. I’m not talking here about the content, but rather the way in which the PDF was produced. Many different authoring and PDF creation tools are in use, some going back decades, resulting in a wide variation in the way PDFs are constructed. This is where Mako comes in.
For this project, Actino Software built centralized services that analyze and optimize the PDFs and prepare them for secure delivery. Mako does this quickly, even for 400 MB files with many thousands of pages. The analysis function identifies the number of pages, bookmarks, fonts, embedded attachments and other features, including document metadata (title, subject, etc.). The optimize function removes invalid hyperlinks, changes named destinations into working hyperlinks, and resaves the file to reduce its size. In this particular case, Mako reduced the file size by more than 50% thanks to image resampling. Mako’s built-in font optimization (eliminating duplicate fonts and merging font subsets) is another way it can reduce file size.
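Mako’s own APIs aren’t shown here, but the general idea behind one of those techniques, duplicate-font elimination, can be sketched in a few lines. This is a minimal, hypothetical illustration (collapsing byte-identical embedded font programs by content hash), not Mako code:

```python
import hashlib

def dedupe_fonts(fonts: dict[str, bytes]) -> tuple[dict[str, bytes], dict[str, str]]:
    """Collapse byte-identical embedded font programs to a single copy.

    `fonts` maps a font resource name to its embedded font program.
    Returns the surviving programs plus a remapping from every
    resource name to the name of the copy that was kept.
    """
    seen: dict[str, str] = {}       # content digest -> surviving name
    unique: dict[str, bytes] = {}
    remap: dict[str, str] = {}
    for name, program in fonts.items():
        digest = hashlib.sha256(program).hexdigest()
        if digest in seen:
            remap[name] = seen[digest]   # duplicate: point at the survivor
        else:
            seen[digest] = name
            unique[name] = program
            remap[name] = name
    return unique, remap
```

Real-world font optimization also merges subsets of the same font, which requires parsing glyph tables rather than simple hashing, but the payoff is the same: the file carries each font only once.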
This work supports two DRM (Digital Rights Management) workflows:
• Distribution via a portable storage device such as a thumb drive, supported by a plug-in to Adobe Acrobat Reader that talks back to the company website for authentication.
• Download from a secure website, for which small file size is a must.
Actino were able to meet the customer requirements with a rapid development schedule, based on their understanding of document workflows and their familiarity with the Mako Core SDK. Their customer chose the Actino solution over an Adobe DRM solution, which was considered too expensive and rigid.
When talking with digital press vendors it soon becomes apparent that the only thing more important than speed is quality; the only thing more important than quality is cost; and the only thing more important than cost is speed. I think I’d have to ask M. C. Escher for an illustration of that!
To focus on speed, what a press vendor usually means when talking to Global Graphics Software is “I need the Digital Front End (DFE) for my press to be able to print every job at full engine speed”, which is a subject that we’re very happy to talk about and to demonstrate solutions for, even as the press engines themselves get faster with every new version.
But the components such as the RIP in a DFE are not the only things that can affect whether a press can be driven at full engine speed. There are plenty of things that a designer or composition engine can do that can vary how fast a PDF file can be RIPped by several orders of magnitude, without affecting the visual appearance of the print.
Obviously we like it when files are efficiently built, but sometimes it’s not obvious to a designer, or to a software developer working on a design application or composition engine, how they might improve the files they generate. That’s why we created a guide called “Do PDF/VT Right” back in 2014, stuffed full of actionable recommendations for both designers and developers making PDF files for variable data printing.
It’s been very well received, and clearly filled a gap in materials available for the target audience; there have been thousands of downloads and printed copies given away at trade shows.
At the end of 2020 a new PDF/VT standard, PDF/VT-3, was published, and the committee in ISO that had developed it asked the PDF Association to write application notes for it, to assist developers implementing it with more extensive detail than can be included in International Standards. That sounds very formal, but in practice the two committees have many members in common (as an example I was project editor on PDF/VT-3 and I co-chair the PDF Association’s PDF/VT Technical Working Group (TWG)). The hand-over was mainly to enable much more agile and responsive document development and more flexibility around publication.
After some debate the PDF/VT TWG decided that what the industry really needed was a best-practice guide on how to construct efficient PDF files for VDP, whether they’re PDF/VT or ‘just’ “optimized PDF”. Any developer who has worked with PDF/VT-1 should have no trouble implementing PDF/VT-3, but slow processing of very inefficient PDF files still prevents print service providers, converters and others from running their digital presses at full engine speed.
The next step was to agree to do the development of that guide in a new form of committee within the PDF Association, specifically so that people who were not members of the Association could be involved.
At this stage Global Graphics offered the text of Full Speed Ahead as a starting point for the Association guide, an offer that was very quickly accepted. But it was felt that it could be made more accessible if two editions of the guide were produced: one for designers and one for developers, rather than combining the two into a single document. Amongst other things it means that each guide can use the most appropriate terminology for each audience, which always makes reading easier.
We were lucky to have Pat McGrew working with us and she took over as champion for the designer edition, while I led on the developer one.
And so I’m very happy to announce that both the Developer and Designer editions of the PDF Association’s “Best Practice in creating PDF files for Variable Data Printing” have just been published and are available with a lot of other useful resources at https://www.pdfa.org/resources/.
In this blog post, Martin Bailey recalls his days as the first chair of the ISO PDF/X task force and how the standard has developed over the last 20 years.
Over the last few years there has been quite an outpouring of nostalgia around PDF. That was first for PDF itself, but at the end of 2021 we reached two decades since the first publication of an ISO PDF/X standard.
I’d been involved with PDF/X in its original home of CGATS (the Committee for Graphic Arts Technical Standards, the body accredited by ANSI to develop US national standards for printing) for several years before it moved to ISO. And then I became the first chair of the PDF/X task force in ISO. So I thought I’d add a few words to the pile, and those have now been published on the PDF Association’s web site at https://www.pdfa.org/the-route-to-pdf-x-and-where-we-are-now-a-personal-history/.
I realised while I was writing it that it really was a personal history for me. PDF/X was one of the first standards that I was involved in developing, back when the very idea of software standards was quite novel. Since then, supported and encouraged by Harlequin and Global Graphics Software, I’ve also worked on standards and chaired committees in CIP3, CIP4, Ecma, the Ghent Working Group, ISO and the PDF Association (I apologise if I’ve missed any off that list!).
It would be easy to assume that working on all of those standards meant that I knew a lot about what we were standardising from day one. But the reality is that I’ve learned a huge amount of what I know about print from being involved, and from talking to a lot of people.
Perhaps the most important lesson was that you can’t (or at least shouldn’t) only take into account your own use cases while writing a standard. Most of the time a standard that satisfies only a single company should just be proprietary working practice instead. It’s only valuable as a standard if it enables technologies, products and workflows in many different companies.
That sounds as if it should be obvious, but the second major lesson was something that has been very useful in environments outside of standards as well. An awful lot of people assume that everyone cares a lot about the things that they care about, and that everything else is unimportant. As an example, next time you’re at a trade show (assuming they ever come back in their historical form) take a look and see how many vendors claim to have product for “the whole workflow”. Trust me, for production printing, nobody has product for the whole workflow. Each one just means that they have product for the bits of the workflow that they think are important. The trouble is that you can’t actually print stuff effectively and profitably if all you have is those ‘important’ bits. To write a good standard you have to take off the blinkers and see beyond what your own products and workflows are doing. And in doing that I’ve found that it also teaches you more about what your own ‘important’ parts of the workflow need to do.
Along the way I’ve also met some wonderful people and made some good friends. Our conversations may have a tendency to dip in and out of print geek topics, but sometimes those are best covered over a beer or two!
At the beginning of 2020, in what we thought was the run-up to drupa, Global Graphics published a new guide called “Full Speed Ahead: How to make variable data PDF files that won’t slow your digital press”. It was designed to complement the recommendations available for how to maximize sales from direct mail campaigns, with technical recommendations as to how you can make sure that you don’t make a PDF file for a variable data job that will bring a digital press to its knees. It also carried those lessons into additional print sectors that are rapidly adopting variable data, such as labels, packaging, product decoration and industrial print, with hints around using variable data in unusual ways for premium jobs at premium margins.
Well, as they say, a lot has happened since then.
And some of that has been positive. At the end of 2020 several new International Standards were published, including a “dated revision” (a 2nd edition) of the PDF 2.0 standard, a new standard for submission of PDF files for production printing: PDF/X-6, and a new standard for submission of variable data PDF files for printing: PDF/VT-3.
We’ve therefore updated Full Speed Ahead to cover the new standards. And at the same time we’ve taken the opportunity to extend and clarify some of the rest of the text in response to feedback on the first edition.
So now you can keep up to date, just by downloading the new edition!
Would you fill your brand-new Ferrari with cheap and inferior fuel? It’s a question posed by Martin Bailey in his new guide: ‘Full Speed Ahead – how to make variable data PDF files that won’t slow your digital press’. It’s an analogy he uses to explain the importance of putting well-constructed PDF files through your DFE so that they don’t disrupt the printing process and the DFE runs as efficiently as possible.
Here are Martin’s recommendations to help you avoid making jobs that delay the printing process, so you can be assured that you’ll meet your print deadline reliably and achieve your printing goals effectively:
If you’re printing work that doesn’t make use of variable data on a digital press, you’re probably producing short runs. If you weren’t, you’d be more likely to choose an offset or flexo press instead. But “short runs” very rarely means a single copy.
Let’s assume that you’re printing, for example, 50 copies of a series of booklets, or of an imposed form of labels. In this case the DFE on your digital press only needs to RIP each PDF page once.
To continue the example, let’s assume that you’re printing on a press that can produce 100 pages per minute (or the equivalent area for labels etc.). If all your jobs are 50 copies long, you therefore need to RIP jobs at only two pages per minute (100ppm/50 copies). Once a job is fully RIPped and the copies are running on press you have plenty of time to get the next job prepared before the current one clears the press.
But VDP jobs place additional demands on the processing power available in a DFE because most pages differ from every other page and must therefore each be RIPped separately. If you’re printing at 100 pages per minute the DFE must RIP at 100 pages per minute: fifty times faster than it needed to for fifty copies of a static job.
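The arithmetic above can be captured in a one-line helper; the numbers are the same worked example (a hypothetical 100 ppm press), not a benchmark:

```python
def required_rip_rate(press_ppm: float, copies: int) -> float:
    """Pages per minute the RIP must sustain to keep the press busy.

    Each distinct page is RIPped once; the press then prints `copies`
    copies from the same rasters, so the required RIP rate is the
    engine speed divided by the run length.
    """
    return press_ppm / copies

# 50 copies of a static job on a 100 ppm press: RIP at just 2 ppm.
# A fully variable job (every page distinct): RIP at the full 100 ppm.
```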
Each minor inefficiency in a VDP job will often only add between a few milliseconds and a second or two to the processing of each page, but those times need to be multiplied up by the number of pages in the job. An individual delay of half a second on every page of a 10,000-page job adds up to around an hour and a half for the whole job. For a really big job of a million pages it only takes an extra tenth of a second per page to add 24 hours to the total processing time.
If you’re printing at 120ppm the DFE must process each page in an average of half a second or less to keep up with the press. The fastest continuous feed inkjet presses at the time of writing are capable of printing an area equivalent to over 13,000 pages per minute, which means each page must be processed in just over 4ms. It doesn’t take much of a slow-down to start impacting throughput.
This extra load has led DFE builders to develop a variety of optimizations. Most of these work by reducing the amount of data that must be RIPped. But even with those optimizations a complex VDP job typically requires significantly more processing power than a ‘static’ job where every copy is the same.
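The optimizations themselves are proprietary and vary by vendor, but the common theme (rasterize each distinct element once and reuse the result) can be modelled with a toy cache. The `rip_fn` below is a made-up stand-in for a real rasterizer:

```python
import hashlib
from typing import Callable

class RasterCache:
    """Toy model of reusable-content caching in a VDP workflow:
    identical content is rasterized once and the result reused."""

    def __init__(self, rip_fn: Callable[[bytes], bytes]):
        self.rip_fn = rip_fn
        self.cache: dict[bytes, bytes] = {}
        self.rip_calls = 0              # how often we actually rasterized

    def get_raster(self, content: bytes) -> bytes:
        key = hashlib.sha256(content).digest()
        if key not in self.cache:       # first sighting: do the expensive work
            self.rip_calls += 1
            self.cache[key] = self.rip_fn(content)
        return self.cache[key]
```

A real DFE caches at the level of PDF XObjects or bands of raster rather than raw byte strings, but the saving has the same shape: work proportional to distinct content, not to total pages.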
The amount of processing required to prepare a PDF file for print in a DFE can vary hugely without affecting the visual appearance of the printed result, depending on how it is constructed.
Poorly constructed PDF files can therefore impact a print service provider in one or both of two ways:
• Output is not achieved at engine speed, reducing return on investment (ROI) because fewer jobs can be produced per shift. In extreme cases when printing on a continuous feed (web-fed) press, a failure to deliver rasters for printing fast enough can also lead to media wastage and may confuse in-line or near-line finishing.
• To compensate for jobs that take longer to process in the DFE, press vendors often provide more hardware to expand the processing capability, increasing the bill of materials, and therefore the capital cost of the DFE.
Once the press is installed and running the production manager will usually calculate and tune their understanding of how many jobs of what type can be printed in a shift. Customer services representatives work to ensure that customer expectations are set appropriately, and the company falls into a regular pattern. Most jobs are quoted on an acceptable turn-round time and delivered on schedule.
A job that takes longer to process than expected disrupts that pattern. Depending on how many presses the print site has, and how they are connected to one or more DFEs, it may lead to a press sitting idle, waiting for pages to print. It may also delay other jobs in the queue, or mean that they must be moved to a different press. Moving jobs at the last minute may not be easy if the presses available are not identical: different presses may require different print streams or imposition, and there may be limitations on stock availability, etc.
Many jobs have tight deadlines on delivery schedules; they may need to be ready for a specific time, with penalties for late delivery, or the potential for reduced return for the marketing department behind a direct mail campaign. Brand owners may be ordering labels or cartons on a just in time (JIT) plan, and there may be consequences for late delivery ranging from an annoyed customer to penalty clauses being invoked.
Those problems for the print service provider percolate upstream to brand owners and other groups commissioning digital print. Producing an inefficiently constructed PDF file will increase the risk that your job will not be delivered by the expected time.
You shouldn’t take these recommendations as suggesting that the DFE on any press is inadequate. Think of it as the equivalent of a suggestion that you should not fill your brand-new Ferrari with cheap and inferior fuel!
The above is an excerpt from Full Speed Ahead: how to make variable data PDF files that won’t slow your digital press. The guide is designed to help you avoid making jobs that disrupt and delay the printing process, increasing the probability that everyone involved in delivering the printed piece hits their deadlines reliably and achieves their goals effectively.
To be the first to receive our blog posts, news updates and product news why not subscribe to our monthly newsletter? Subscribe here
About the author:
Martin Bailey first joined what has now become Global Graphics Software in the early nineties, and has worked in customer support, development and product management for the Harlequin RIP as well as becoming the company’s Chief Technology Officer. During that time he’s also been actively involved in a number of print-related standards activities, including chairing CIP4, CGATS and the ISO PDF/X committee. He’s currently the primary UK expert to the ISO committees maintaining and developing PDF and PDF/VT.
The use of variable data has increased exponentially over the past five years and is emerging in new applications such as industrial inkjet. Yet poorly designed variable data PDF files disrupt production and reduce ROI.
Watch the recent webinar with Global Graphics Software’s CTO Martin Bailey, the author of Full Speed Ahead, a new guide to offer advice to anyone with a stake in variable data printing, including graphic designers, print buyers, production managers, press operators, composition tool developers and users.
In the webinar Martin presents an overview of the guide and highlights some of the key tips and tricks for graphic designers, prepress and print service providers, showing how, when they all work together, VDP jobs can fly through digital presses.
Sponsored by Delphax Solutions, Digimarc, HP Indigo, HP PageWide Industrial, HYBRID Software, Kodak, Racami and WhatTheyThink, the guide is in a practical format for easy reference and includes:
• Tips and tricks for making fast, efficient PDF files for variable data printing
• Helpful illustrations, photos and explanatory diagrams
• Real examples from industry
Congratulations to our very own Martin Bailey and to Peter Wyatt, the general manager of CiSRA, for being appointed co-chairs of the PDF Technical Working Group (TWG) within the PDF Competence Centre branch of the PDF Association. https://www.pdfa.org/working-group/pdf-competence-center/
Following the publication of the new ISO PDF 2.0 standard – ISO 32000-2 – in July 2017, the PDF TWG will be producing PDF 2.0 Application Notes to support the implementation of the standard by developers whose tools create and consume PDF.
ISO 32000-2 is the first PDF specification developed within the ISO working-group structure, involving subject matter experts from many countries, and is the first “post-Adobe” standard since Adobe handed over PDF’s development to ISO.
Speaking on news of his appointment Martin said, “The value of a standard can be greatly increased by a wider involvement of the relevant communities in shared education and discussion. The PDF Association has become the obvious group to help foster and guide that wider involvement for PDF itself and for many of the PDF-based standards in use today.”
Duff Johnson, the PDF Association’s executive director said, “PDF 2.0 is designed to be largely backward compatible, but older processors won’t handle new features. The purpose of the new documents that will be developed by the PDF TWG is to help developers develop a common understanding of the new specification as well as best practices for implementation.” We’re very happy that Martin and Peter have agreed to lead this effort.
Martin Bailey is the primary UK expert to the ISO committees working on PDF, PDF/X and PDF/VT. In 2017 Global Graphics Software hosted two PDF 2.0 interoperability workshops on behalf of the PDF Association to provide a way for PDF tool developers to validate their work against the new ISO 32000-2 (PDF 2.0) standard by working with vendors of other tools.
Last week was the first PDF 2.0 interop event in Cambridge, UK, hosted by Global Graphics on behalf of the PDF Association. The interop was an opportunity for developers from various companies working on their support for PDF 2.0 to get together and share sample files, and to process them in their own solutions. If a sample file from one vendor isn’t read correctly by a product from another vendor the developers can then figure out why, and fix either the creation tool or the consumer, or even both, depending on the exact reason for that failure.
When we make our own PDF sample files to test the Harlequin RIP there’s always a risk that the developer making the file and the developer writing the code to consume it will make the same assumptions or misread the specification in the same way. That makes testing files created by another vendor invaluable, because it validates all of those assumptions and possible misinterpretations as well.
It’s pretty early in the PDF 2.0 process (the standard itself will probably be published later this month), which means that some vendors are not yet far enough through their own development cycles to get involved yet. But that actually makes this kind of event even more valuable for those who participate because there are no currently shipping products out there that we could just buy and make sample files with. And the last thing that any of us want to do as vendors is to find out about incompatibilities after our products are shipped and in our customers’ hands.
I can tell you that our testing and discussions at the interop in Cambridge were extremely useful in finding a few issues that our internal testing had not identified. We’re busy correcting those, and will be taking updated software to the next interop, in Boston, MA on June 12th and 13th.
If you’re a Harlequin OEM or member of the Harlequin Partner Network you can also get access to our PDF 2.0 preview code to test against your own or other partners’ products; just drop me a line. If you’re using Harlequin in production I’m afraid you’ll have to wait until we release our next major version!
If you’re a software vendor with products that consume or create PDF and you’re already working on your PDF 2.0 support I’d heartily recommend registering for the June interop. I don’t know of any more efficient way to identify defects in your implementation so you can fix them before your customers even see them. Visit https://www.pdfa.org/event/pdf-interoperability-workshop-north-america/ to get started.
And if you’re a PDF software vendor and you’re not working on PDF 2.0 yet … time to start your planning!
In the middle of 2017 ISO 32000-2 will be published, defining PDF 2.0. It’s eight years since there’s been a revision to the standard. We’ve already covered the main changes affecting print in previous blog posts and here Martin Bailey, the primary UK expert to the ISO committee developing PDF 2.0, gives a roundup of a few other changes to expect.
The encryption algorithms included in previous versions of PDF have fallen behind current best practices in security, so PDF 2.0 adds AES-256 encryption and requires that all passwords used for AES-256 encryption be encoded in Unicode.
A PDF 1.7 reader will almost certainly error and refuse to process any PDF files using the new AES-256 encryption.
Note that Adobe’s ExtensionLevel 3 to ISO 32000-1 defines a different AES-256 encryption algorithm, as used in Acrobat 9 (R=5). That implementation is now regarded as dangerously insecure and Adobe has deprecated it completely, to the extent that its use is forbidden in PDF 2.0.

Deprecation and what this means in PDF
PDF 2.0 has deprecated a number of implementation details and features that were defined in previous versions. In this context ‘deprecation’ means that tools writing PDF 2.0 are recommended not to include those features in a file; and that tools reading PDF 2.0 files are recommended to ignore those features if they find them.
Global Graphics has taken the deliberate decision not to ignore relevant deprecated items in PDF files that are submitted and happen to be identified as PDF 2.0. This is because it is quite likely that some files will be created using an older version of PDF and using those features. If those files are then pre-processed in some way before submitting to Harlequin (e.g. to impose or trap the files) the pre-processor may well tag them as now being PDF 2.0. It would not be appropriate in such cases to ignore anything in the PDF file simply because it is now tagged as PDF 2.0.
We expect most other PDF readers to take the same course, at least for the next few years.

And the rest…

PDF 2.0 header: It’s only a small thing, but a PDF reader must be prepared to encounter a value of 2.0 in the file header and as the value of the Version key in the Catalog.
PDF 1.7 readers will probably vary significantly in their handling of files marked as PDF 2.0. Some may error, others may warn that a future version of that product is required, while others may simply ignore the version completely.
Harlequin 11 reports “PDF Warning: Unexpected PDF version – 2.0” and then continues to process the job. Obviously that warning will disappear when we ship a new version that fully supports PDF 2.0.

UTF-8 text strings: Previous versions of PDF allowed certain strings in the file to be encoded in PDFDocEncoding or in 16-bit Unicode. PDF 2.0 adds support for UTF-8. Many PDF 1.7 readers will not recognise a UTF-8 string as UTF-8 and will therefore treat it as PDFDocEncoding, resulting in those strings displaying as what looks like a random sequence of mainly accented characters.

Print scaling: PDF 1.6 added a viewer preferences key that allows a PDF file to specify the preferred scaling for use when printing it, primarily in support of engineering drawings. PDF 2.0 adds the ability to say that the nominated scaling should be enforced.

Document parts: The PDF/VT standard defines a structure of Document parts (commonly called DPart) that can be used to associate hierarchical metadata with ranges of pages within the document. In PDF/VT the purpose is to enable embedding of data to guide the application of different processing to each page range.
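The string-encoding behaviour described above comes down to the byte-order mark at the front of a PDF text string. Here is a minimal classifier following the rules in ISO 32000 (an illustrative sketch, not Harlequin code):

```python
def classify_pdf_text_string(raw: bytes) -> str:
    """Classify a PDF text string by its leading byte-order mark.

    ISO 32000 treats a text string as UTF-16BE if it starts with
    FE FF, as UTF-8 (new in PDF 2.0) if it starts with EF BB BF,
    and as PDFDocEncoding otherwise.
    """
    if raw.startswith(b"\xfe\xff"):
        return "UTF-16BE"
    if raw.startswith(b"\xef\xbb\xbf"):
        return "UTF-8"
    return "PDFDocEncoding"

# A PDF 1.7 reader that doesn't know the UTF-8 BOM falls through to the
# PDFDocEncoding branch, producing the 'accented character' garbage
# described above.
```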
PDF 2.0 has added the Document parts structure into baseline PDF, although no associated semantics or required processing for that data have been defined.
It is anticipated that the new ISO standard on workflow control (ISO 21812, expected to be published around the end of 2017) will make use of the DPart structure, as will the next version of PDF/VT. The specification in PDF 2.0 is largely meaningless until such time as products are written to work with those new standards.
The last few years have been pretty stable for PDF; PDF 1.7 was published in 2006, and the first ISO PDF standard (ISO 32000-1), published in 2008, was very similar to PDF 1.7. In the same way, PDF/X‑4 and PDF/X‑5, the most recent PDF/X standards, were both published in 2010, six years ago.
In the middle of 2017 ISO 32000-2 will be published, defining PDF 2.0. Much of the new work in this version is related to tagging for content re-use and accessibility, but there are also several areas that affect print production. Among them are some changes to the rendering of PDF transparency, ways to include additional data about spot colors and about how color management should be applied.