1,500-plus file format coverage
Legacy document formats, email archives, container formats, and modern office formats are covered. Generic open-source parsers routinely miss formats KeyView handles.
OpenText • Content services
File Content Extraction (KeyView) parses 1,500+ file formats into clean text and metadata for AI, analytics, and compliance pipelines. Merito integrates KeyView into customer systems.
Merito sells OpenText File Content Extraction (KeyView lineage) and delivers SDK integration, pipeline wiring, and format-specific parsing strategy for AI, analytics, and compliance workflows requiring clean text from diverse file formats.
What it is
OpenText File Content Extraction is the current OpenText brand of the product that eDiscovery and content analytics customers have known as KeyView for decades. It parses content and metadata out of 1,500-plus file formats, handles embedded objects and container formats, and produces clean text downstream systems can consume.
The product's value is coverage: customers adopting AI, running eDiscovery, or feeding content analytics cannot assume generic open-source parsers handle the files they encounter. Legal, financial, and government customers routinely process content where KeyView's format coverage is the difference between usable input and lost evidence.
KeyView is rarely an end-user product. Programs consume it as an SDK inside larger content pipelines that feed Knowledge Discovery, Content Aviator, eDiscovery workflows, or custom analytics systems. Merito's engagements are integration and pipeline work, not user-facing adoption.
Ideal use cases
What it is best at
KeyView deploys as an SDK embedded in customer applications or as a service consumed by pipelines. Choose the shape that matches the integration pattern.
ZIP archives, email attachments, embedded spreadsheets, and nested containers are all unpacked so downstream systems see clean content.
Core capabilities
Broad file format coverage with clean text and metadata extraction.
Office formats
Microsoft Word, Excel, PowerPoint, and similar office formats across versions.
Email archives
PST, OST, MBOX, EML, and similar mail archive formats.
Container and archive formats
ZIP, TAR, and nested archives unpacked recursively.
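To make "unpacked recursively" concrete, here is a minimal sketch of recursive unpacking for nested ZIP archives using only the Python standard library. It is not KeyView code and does not use the KeyView SDK; the names `walk_zip` and `make_zip` and the `!/` path separator are assumptions made for illustration.

```python
import io
import zipfile
from typing import Iterator, Tuple


def walk_zip(data: bytes, prefix: str = "", max_depth: int = 5) -> Iterator[Tuple[str, bytes]]:
    """Yield (logical_path, content) for every leaf file inside a ZIP,
    descending into nested ZIPs up to max_depth levels."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            payload = archive.read(info)
            path = f"{prefix}{info.filename}"
            # Detect nested ZIPs by content rather than by file extension.
            if max_depth > 0 and zipfile.is_zipfile(io.BytesIO(payload)):
                yield from walk_zip(payload, prefix=f"{path}!/", max_depth=max_depth - 1)
            else:
                yield path, payload


def make_zip(members: dict) -> bytes:
    """Build an in-memory ZIP from {name: bytes} (demo helper, hypothetical)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in members.items():
            archive.writestr(name, content)
    return buffer.getvalue()


inner = make_zip({"report.txt": b"quarterly numbers"})
outer = make_zip({"readme.txt": b"hello", "attachments.zip": inner})
leaves = dict(walk_zip(outer))
```

Here `leaves` maps `readme.txt` and `attachments.zip!/report.txt` to their contents. The depth limit guards against pathological nesting. Note that this sketch also descends into any ZIP-based file, including modern office documents, which a production parser would treat as documents rather than as archives.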
Embedded content and metadata preserved through extraction.
Embedded object extraction
Embedded spreadsheets, images, and OLE objects extracted alongside parent content.
Metadata preservation
File metadata (author, dates, revision history) preserved for downstream analytics.
Structured content output
Text, metadata, and embedded content output in structured form for pipeline consumption.
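As an illustration of what structured output can look like to a downstream pipeline, the sketch below models one extracted document as a record holding text, metadata, and embedded children, then serializes it to JSON. The `ExtractedDocument` shape, its field names, and the sample values are hypothetical and are not KeyView's actual output schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class ExtractedDocument:
    """Hypothetical record shape; not KeyView's actual output schema."""
    path: str
    text: str
    metadata: Dict[str, str] = field(default_factory=dict)
    children: List["ExtractedDocument"] = field(default_factory=list)


def flatten(doc: ExtractedDocument) -> List[ExtractedDocument]:
    """Return the document followed by all embedded descendants, depth first."""
    result = [doc]
    for child in doc.children:
        result.extend(flatten(child))
    return result


# Invented sample data: a memo with one embedded spreadsheet.
sheet = ExtractedDocument(
    path="memo.docx!/budget.xlsx",
    text="Q1 1200",
    metadata={"author": "A. Analyst"},
)
memo = ExtractedDocument(
    path="memo.docx",
    text="See attached budget.",
    metadata={"author": "B. Writer"},
    children=[sheet],
)

# asdict() recurses into nested dataclasses, so children serialize too.
serialized = json.dumps(asdict(memo), indent=2)
```

Keeping embedded objects as children of the parent record preserves the parent-child relationship, while `flatten` gives indexers a simple list to iterate over.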
SDK or service deployment shapes for different integration patterns.
SDK integration
Native SDK embedded in customer applications for direct file-format parsing.
Service deployment
Service-based deployment for pipelines that call KeyView over APIs.
Scale patterns
Horizontal scale for large corpus processing and peak-event workloads.
Where it fits in the stack
Deployment and implementation
Licensing and packaging
File Content Extraction SDK
SDK for embedding format parsing inside customer applications and products.
Best for: Product teams embedding parsing in customer-facing systems.
File Content Extraction service
Service deployment consumed by pipelines over APIs.
Best for: Content pipelines processing diverse formats at scale.
Merito services
Merito sells licenses and the delivery work around them. Pick the service that matches where you are in the lifecycle.
SDK or service deployment, pipeline wiring, and format-coverage validation.
KeyView pipelines integrated with CI/CD and orchestration frameworks.
Legacy KeyView integrations modernized for new SDK versions and deployment shapes.
Named engineer, priority SLAs, and release-time coverage for KeyView in production.
Long-term run support for KeyView pipelines including format-coverage evolution.
Engineering training for SDK embedding and pipeline integration.
Merito-placed engineers embedding KeyView in customer pipelines and products.
File Content Extraction licensing
Merito sells OpenText File Content Extraction and delivers the SDK or service integration, pipeline wiring, and format-coverage validation that turns KeyView into measurable AI and analytics lift.
Merito point of view
Merito has seen customers build AI retrieval pipelines on open-source parsers and find that fifteen percent of the corpus never parsed correctly. The AI could not answer questions about content the parser dropped. The fix was KeyView, not a better model. File parsing quality sets a ceiling on everything downstream.
For legal and regulated customers, KeyView is almost non-negotiable. Format coverage at eDiscovery scale is not a problem generic parsers solve. The investment in KeyView pays back on one investigation where a format-specific extraction prevents losing evidence.
For AI programs, Merito's guidance is pragmatic: start with the parser you have, measure the drop-off rate against the actual corpus, and add KeyView when the drop-off matters to business value. Skipping straight to KeyView without measurement is over-buying; skipping KeyView for diverse corpora is under-serving AI output quality.
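The measurement step can be sketched in a few lines. The example below counts, per file extension, how many files an existing parser fails on or returns no usable text for. Everything here is an assumption for illustration: `measure_drop_off`, the `toy_parser` stand-in, and the four-file sample corpus are hypothetical, and a real run would plug in the parser already in use and the customer's actual files.

```python
from collections import Counter
from pathlib import PurePath
from typing import Callable, Dict, Iterable, Optional, Tuple


def measure_drop_off(
    corpus: Iterable[Tuple[str, bytes]],
    parse: Callable[[str, bytes], Optional[str]],
) -> Dict[str, Tuple[int, int, float]]:
    """Return {extension: (total, dropped, drop_rate)} for a corpus.

    A file counts as dropped when the parser raises or returns no usable text.
    """
    totals: Counter = Counter()
    dropped: Counter = Counter()
    for name, data in corpus:
        ext = PurePath(name).suffix.lower() or "(none)"
        totals[ext] += 1
        try:
            text = parse(name, data)
        except Exception:
            text = None
        if not text or not text.strip():
            dropped[ext] += 1
    return {ext: (n, dropped[ext], dropped[ext] / n) for ext, n in totals.items()}


def toy_parser(name: str, data: bytes) -> Optional[str]:
    """Stand-in for 'the parser you have': it only understands UTF-8 .txt files."""
    if not name.lower().endswith(".txt"):
        raise ValueError(f"unsupported format: {name}")
    return data.decode("utf-8")


# Invented four-file corpus; real measurements need the actual corpus.
sample = [
    ("notes.txt", b"meeting notes"),
    ("empty.txt", b"   "),
    ("mail.pst", b"\x21\x42\x44\x4e"),
    ("deck.ppt", b"\xd0\xcf\x11\xe0"),
]
report = measure_drop_off(sample, toy_parser)
overall = sum(d for _, d, _ in report.values()) / sum(n for n, _, _ in report.values())
```

Breaking the rate down by extension shows where the loss concentrates, which is the evidence needed to decide whether a broader-coverage parser is worth adding. Whitespace-only output is counted as a drop here because text that is technically returned but empty is still unusable downstream.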
What buyers usually underestimate
Frequently Asked Questions
Consultation request
Share the pipeline or application where you need format parsing and the corpus scope. A Merito parsing specialist follows up within one business day.
Coverage that matters
Generic parsers routinely miss formats KeyView handles. Merito validates coverage against the customer's actual corpus.
Pipeline integration
KeyView embeds in customer applications as an SDK or deploys as a pipeline service. Merito picks the shape that matches the integration pattern.
Next step
A Merito KeyView engagement measures parser drop-off on the actual corpus, then adopts and integrates KeyView where the lift is real.