Cogapp Refinery Services
Scalable cloud-based image and data processing
Thoughtful application of automated enrichment tools
Cogapp has been building large-scale digital platforms for some of the most recognised cultural institutions for around 30 years.
We have distilled all of this experience into Cogapp Refinery Services. CRS is a performant, scalable and robust pipeline for the enrichment and presentation of large collections of digital material.
Talk to us about your data processing and enrichment plans
We employ many techniques to enhance and enrich digital images, audio and video; examples include:
- OCR of handwritten text
- Named entity recognition
- Language recognition
- Image categorisation
- Metadata extraction
For a given corpus of material, we work closely with an institution’s domain experts to select the best set of tools to apply to the material. Using this domain knowledge in combination with the chosen toolset provides a service that is greater than the sum of its parts.
Modern, scalable, serverless infrastructure
At its heart CRS is based on modern, scalable cloud infrastructure. Every job is different so CRS was developed to be highly modular. We work with our clients to understand their goals and together we develop a plan for how to deploy a pipeline to meet and, where possible, exceed requirements.
CRS uses a serverless technology stack, so resources scale up and down on demand to match each project’s needs. As a result, the system can run many thousands of jobs in parallel, which keeps our pipelines performant at scale.
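As a rough illustration of the modular, parallel approach described above, the sketch below chains small enrichment modules and runs them over a batch of records concurrently. It is not the CRS implementation: the module names (`normalise_whitespace`, `add_word_count`), the record shape and the use of a local thread pool in place of serverless functions are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]
Module = Callable[[Record], Record]


def normalise_whitespace(record: Record) -> Record:
    """Hypothetical module: collapse runs of whitespace in the text field."""
    text = str(record.get("text", ""))
    return {**record, "text": " ".join(text.split())}


def add_word_count(record: Record) -> Record:
    """Hypothetical module: attach a simple word count to the record."""
    text = str(record.get("text", ""))
    return {**record, "word_count": len(text.split())}


def run_pipeline(record: Record, modules: Iterable[Module]) -> Record:
    """Pass one record through each module in order."""
    for module in modules:
        record = module(record)
    return record


def run_batch(records: List[Record], modules: List[Module],
              max_workers: int = 8) -> List[Record]:
    """Process records concurrently; results keep the input order.

    A thread pool stands in here for the serverless workers a real
    deployment would use (an assumption for this sketch).
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda r: run_pipeline(r, modules), records))


if __name__ == "__main__":
    batch = [{"id": 1, "text": "A  letter   from 1842"}, {"id": 2, "text": ""}]
    for result in run_batch(batch, [normalise_whitespace, add_word_count]):
        print(result)
```

Because each module takes a record and returns a new one, modules can be added, removed or reordered per project without changing the code that runs them.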
Outputs
The highly modular nature of CRS makes it extremely flexible in terms of output. The system can be configured to update third-party systems directly or simply output results to an agreed intermediary format for later ingestion.
Cogapp is very familiar with the world of metadata standards, and we apply this experience so that our clients can use any output produced as efficiently as possible.
Some example outputs offered by various modules include:
- OCR output
  - ALTO XML
  - Full text
- Resized and cropped image sets
  - Based on output from other CRS modules: for example, using objects identified by computer vision, we can custom-crop to interesting objects within an image
- IIIF endpoints
  - For images and/or metadata
- PDFs
  - Fully searchable for text-based content
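To make the cropped-image and IIIF outputs more concrete, here is a minimal sketch of how a bounding box (for example, one returned by an object-detection step) could be turned into an IIIF Image API request for just that region. It follows the URL pattern in the public IIIF Image API 3.0 specification; the function name `iiif_crop_url`, the server address and the identifier are hypothetical, and this is not a description of how CRS builds its URLs.

```python
from typing import Tuple
from urllib.parse import quote


def iiif_crop_url(base_url: str, identifier: str,
                  box: Tuple[int, int, int, int],
                  size: str = "max", rotation: int = 0,
                  quality: str = "default", fmt: str = "jpg") -> str:
    """Build an IIIF Image API URL for a pixel region of an image.

    Pattern: {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
    `box` is (x, y, width, height) in pixels of the full-size image.
    """
    x, y, w, h = box
    if w <= 0 or h <= 0:
        raise ValueError("region width and height must be positive")
    region = f"{x},{y},{w},{h}"
    # Identifiers must be URL-encoded, including any slashes they contain.
    encoded_id = quote(identifier, safe="")
    return (f"{base_url.rstrip('/')}/{encoded_id}/"
            f"{region}/{size}/{rotation}/{quality}.{fmt}")


if __name__ == "__main__":
    # Hypothetical server and identifier, used only for illustration.
    print(iiif_crop_url("https://example.org/iiif", "page 1", (10, 20, 300, 400)))
```

Requesting the region from the image server in this way avoids storing a separate file for every crop, although pre-generated image sets remain an option where that suits a project better.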
Cogapp has delivered many award-winning projects over the years. Should it be required, we can of course accommodate custom web development alongside CRS to help present your digital assets in the most appropriate and engaging way for your audience.
Automation should always start with a conversation
Cogapp has been privileged to work on high-end, high-volume applications of Artificial Intelligence, Machine Learning, Natural Language Processing and Computer Vision.
Get in touch to speak with us about how we can bring the benefits of these technologies to your projects, both practically and at scale.