Verify.PDFium

Extends Verify to allow verification of PDF documents via PDFium.

Verifying a pdf produces:

  • A .verified.txt containing the page count, each page's size (in PDF points) and extracted text, and the document information dictionary entries (Title, Author, Producer, dates, etc.).
  • The pdf itself as .verified.pdf. This can be omitted with ExcludePdfDocument.
  • A PNG render of every page as #page_0001.verified.png, #page_0002.verified.png, etc.

The non-deterministic fields of the pdf (the trailer /ID, the /CreationDate and /ModDate, and the equivalent XMP metadata) are neutralized so the same source document produces a byte-identical .verified.pdf across runs.

Rendering is provided by Morph.PDFium, which wraps the prebuilt PDFium binaries from pdfium-binaries (Windows, Linux, and macOS). Rendering is deterministic for a given Morph.PDFium version: the same input produces byte-identical PNGs on every machine and OS, and no image library dependency is added.

See Milestones for release notes.

Sponsors

Entity Framework Extensions

Entity Framework Extensions is a major sponsor and is proud to contribute to the development of this project.

Developed using JetBrains IDEs

NuGet

Usage

Enable Verify.PDFium

[ModuleInitializer]
public static void Initialize() =>
    VerifyPDFium.Initialize();

Initialize optionally takes the render resolution, for example VerifyPDFium.Initialize(dpi: 150). The default of 96 dpi renders an A4 page at 794 x 1123 pixels.
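
The A4 pixel size quoted above follows from the page size in PDF points (1/72 inch) scaled by the dpi. The sketch below reproduces that arithmetic. ToPixels is a hypothetical helper, not part of Verify.PDFium, and rounding to the nearest whole pixel is an assumption about how the rendered bitmap is sized.

```csharp
using System;

// Hypothetical helper (not part of Verify.PDFium): converts a length in
// PDF points (1/72 inch) to pixels at the given dpi.
// Assumption: the result is rounded to the nearest whole pixel.
static int ToPixels(double points, int dpi) =>
    (int)Math.Round(points * dpi / 72.0, MidpointRounding.AwayFromZero);

// A4 is 210 x 297 mm, which is roughly 595.276 x 841.89 points.
const double a4Width = 595.276;
const double a4Height = 841.89;

// Default resolution (96 dpi).
Console.WriteLine($"{ToPixels(a4Width, 96)} x {ToPixels(a4Height, 96)}");   // 794 x 1123

// Higher resolution, as in VerifyPDFium.Initialize(dpi: 150).
Console.WriteLine($"{ToPixels(a4Width, 150)} x {ToPixels(a4Height, 150)}"); // 1240 x 1754
```

Under the same assumption, raising the dpi from 96 to 150 grows each verified PNG from 794 x 1123 to 1240 x 1754 pixels, so existing page snapshots would need to be re-accepted after changing the setting.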

Verify a file

[Test]
public Task VerifyPdf() =>
    VerifyFile("sample.pdf");

Verify a Stream

[Test]
public Task VerifyPdfStream()
{
    var stream = new MemoryStream(File.ReadAllBytes("sample.pdf"));
    return Verify(stream, "pdf");
}

Exclude the pdf document

Some pdf producers embed non-deterministic bytes that cannot be neutralized. For example, Aspose.Cells always embeds the machine's system fonts (it has no way to restrict font resolution to a bundled set), so the pdf bytes differ from one machine to the next even for the same input. ExcludePdfDocument drops the .verified.pdf from the snapshot for that verification, while still verifying the deterministic rendered pages and info file:

[Test]
public Task ExcludePdfDocument() =>
    VerifyFile("sample.pdf")
        .ExcludePdfDocument();

Icon

PDF designed by Meilia from The Noun Project.
