Unlocking AI Potential: Transform Enterprise Automation with Hypercell
Organizations today face immense pressure to modernize, cut costs, and enhance productivity. The key to unlocking successful digital transformation now includes mastering accurate data processing and harnessing the power of Generative AI to optimize the automation of back-office functions at scale. Discover how the latest enhancements in our core platform, the Hyperscience Hypercell, enable transformational AI at scale—leading to tangible ROI and competitive advantage in today's rapidly changing environment.
In this session, Hyperscience's Lead Product Manager, Jamie Wittenberg, Director of Sales Engineering, Rich Mautino, and CTO, Brian Weiss, show you how to leverage the Hyperscience Hypercell to deepen automation capabilities and further embed Machine Learning into the core of your enterprise.
Watch the on-demand webinar to discover how Hyperscience can accelerate your digital transformation journey.
Brian Weiss: Good morning everybody, and welcome to the Hyperscience R40 webinar. We do these on a biannual basis to keep you up to date on Hyperscience. There is a ton of really great work behind this release. We've got some innovations coming out that our teams have been working on for a very long time, finally coming to fruition here, and we are really excited to share them with you today.
By way of introductions: I am Brian Weiss, the CTO for the organization. I'm joined by Jamie Wittenberg, our lead product manager, and Rich Mautino, who is clearly auditioning for a James Bond movie with that headshot. Looking good, Rich.
By way of agenda today, the plan is: I want to give you a quick update on Hyperscience, our direction in the market, our investment priorities, and then we'll pivot into some of the goodness in this new release. We have cherry-picked a few highlights from the release; we can't do it all. And look, I see a bunch of familiar names on the attendee list, some from our customer advisory board. Hi everybody, it's nice to see you again. But I also see many folks who might be new to Hyperscience. So what we're gonna try to do is be specific about what's coming in R40, but also generalize for those of you who are just learning about Hyperscience.
So kicking that off, we have just celebrated a 10-year anniversary. And to get started, I thought it would be useful to walk through the evolution of the technology, where we've invested, and how it's gotten us to where we are. Hyperscience was founded by two machine learning experts and data scientists who, after selling their first machine-learning-based venture to SoundCloud—it was a consumer use case—turned their attention to taking AI and deep learning to the core of the enterprise. And they went after the most challenging problems, the ones caused by messy, unstructured human data. What they correctly assumed was this: we people take for granted that you can look at a piece of paper with handwriting all over it and know exactly what it means, instantly. If they could train machines to do that using computer vision and language models, the unlock in the enterprise would be spectacularly valuable.
So a foundational investment for Hyperscience really is the models and machine learning that read human data at scale. Our handwriting models are the gold standard in the industry today across multiple languages. Now, the other part of that same premise is that models don't live alone. Handing a customer a bag of models and having them go hire data science people to figure out how to use 'em and run 'em is not gonna work, because data scientists are gonna be scarce. So they invested heavily in the premise that the models should be supported by a platform that is usable by business users, not data science folks. So they invested in a low-code model management platform which enables business users, not data scientists, to quickly train up sovereign models built on their own enterprise data. You start with Hyperscience models and you augment them with your own enterprise data. Now, these models are highly tuned to the task. They are transparent in their processing, they're evergreen, they're constantly improving, and of course, secure. And the models end up functioning like digital workers. You hire them, you train them, you supervise their work, you QA it, and then you rinse and repeat, and they live on an assembly line.
Secondly, Hyperscience pioneered a unique approach to just-in-time human-in-the-loop for AI and automation, whereby accuracy is a brokered target inside the model. Human users get called in by the AI, the digital worker, to unstick specific points of confusion during processing. Think of this as an accuracy harness which ensures human-level correctness for a hundred percent of the work. Compare that to the old-school way of doing this: you punt the work to a black box, it spits out a bunch of data with no accountability to accuracy, so it might be incorrect, which leaves you to go create a QC harness, and everything the black box can't run gets handed off to people outside the process. Hyperscience changes that and takes responsibility for a hundred percent of the work at human-level accuracy. So what you end up doing is taking a fraction of what you're spending on your BPO, or whatever it's costing you to make the output right with people, and putting those people right next to the machine at the point of processing. Not only does that unstick the model and make it go faster and more accurate, but the work those people do can, in the long run, help the model get smarter.
And the third premise here that the founders started with, and we have continued to invest in, is turnkey infrastructure. Again, it's not a bag of models that you're left to staff and operate yourself. They invested in making it cost effective and secure. You don't need to overspend on GPUs if you can get a 99%-accuracy, narrower model running on CPU, so why would you? Hyperscience can easily be deployed and run in any environment. We run in air-gapped environments right now, on-prem, cloud, and SaaS, at the highest level of security and controls for your data.
And the fourth pillar of investment early on was in the Blocks and Flows orchestration platform. The entire platform runs on an extensible pipeline. Think about that concept of a pipeline: it means I can combine any number of models in any order, in addition to the business logic steps, data transforms, and validations that are required to run an end-to-end process, not just extraction of data. So if you want to make sure that the handwritten note at the top of an application about somebody's income matches the box on page 50 of their W2 that's hundreds of pages deep in a submission, no problem. If it doesn't match and you wanna notify the underwriter there's a problem, you can do that as well.
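To make that pipeline idea concrete, here is a minimal sketch of the kind of cross-field validation a flow can express. The block names, data shapes, and helper functions are invented for illustration; this is not the actual Hyperscience Blocks and Flows API.

```python
# Illustrative only: invented block names and data shapes, not the real
# Hyperscience Blocks and Flows API.

def extract_application(pages):
    # Stand-in for a model block that reads the handwritten application page.
    return {"stated_income": "$85,000"}

def extract_w2(pages):
    # Stand-in for a model block that reads the W2 deep in the submission.
    return {"box_1_wages": "85,000.00"}

def normalize_amount(value):
    # Business-logic step: strip formatting so amounts compare cleanly.
    return float(value.replace("$", "").replace(",", ""))

def notify_underwriter(submission_id, message):
    # Stand-in for a downstream notification step.
    print(f"[submission {submission_id}] {message}")

def run_flow(submission):
    # Two extraction blocks followed by a validation block, in one pipeline.
    app = extract_application(submission["application_pages"])
    w2 = extract_w2(submission["w2_pages"])
    if normalize_amount(app["stated_income"]) != normalize_amount(w2["box_1_wages"]):
        notify_underwriter(submission["id"], "Income mismatch between application and W2")

run_flow({"id": 1234, "application_pages": [], "w2_pages": []})
```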
So those are the first foundational elements. Casting forward into the market, into really the second chapter of Hyperscience: we have used that orchestration pipeline and that commitment to machine learning driven by humans to incorporate other models, right? You can bring your own model as needed to take an ensemble approach. Now, what we've discovered is LLMs are super useful for summarizing intent, for example, RAG-driven questions, that sort of thing. Not so much for accurate extraction of complex data. And part of the problem there is they're not actually accountable to accuracy. So if one of those third-party models gets something wrong, you can't call them up and say, "Hey, I need you to fix it. I'm gonna give you data and an example of what you need to change right now."
What we're seeing, then, is great success with customers taking an ensemble approach where you start with Hyperscience to do the hard lift and the accuracy, and you use maybe a third-party model to take that last mile for automation. The real advantage is that you can bring those third-party tools into an orchestrated pipeline without giving away transparency or control over the outcome.
We also extended the investments in turnkey infrastructure to really run at the highest security standards. By the end of this year, and Jamie is gonna talk about this a little bit, we will be FedRAMP High. And fourth and finally, that orchestration layer, blocks and flows and visibility: we extended that to be able to look deep into enterprise systems, connecting to those systems for two reasons. One, I wanna go get information about, say, an insurance claim that I've pulled from the documents, but I don't know that customer's ID; it isn't actually in the claim. I can now reach into an enterprise system, get that information, and pull it forward to bring the data to the point of decisioning inside Hyperscience, in that just-in-time, human-in-the-loop process. The second piece is to be able to connect into enterprise systems and push data from Hyperscience down into them. The most interesting case, which we're gonna talk about, is of course AI-driven use cases where we're chopping up a long-form document, vectorizing that data into a vector database, and then driving an agent, an agentic AI if you will, to learn about the underlying information in the business.
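For a rough sense of that chunk-and-vectorize pattern, here is a small, self-contained sketch. The embedding function and the in-memory index are stand-ins so the example runs on its own; a real deployment would use a production embedding model and a vector database.

```python
# Illustrative chunk-and-vectorize sketch; embed() is a stand-in, not a real model.
import numpy as np

def embed(text: str) -> np.ndarray:
    # Stand-in embedding: seeded per text so the sketch is runnable end to end.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(128)
    return v / np.linalg.norm(v)

def chunk(document: str, size: int = 500) -> list[str]:
    # Split a long-form document into fixed-size passages.
    return [document[i:i + size] for i in range(0, len(document), size)]

def build_index(document: str):
    # "Vectorize" every passage and keep the vectors alongside the text.
    passages = chunk(document)
    return passages, np.stack([embed(p) for p in passages])

def query(index, question: str, k: int = 3) -> list[str]:
    # An agent retrieves the top-k most similar passages for a question.
    passages, vectors = index
    scores = vectors @ embed(question)  # cosine similarity (unit-norm vectors)
    return [passages[i] for i in np.argsort(scores)[::-1][:k]]

index = build_index("...hundreds of pages of long-form policy text..." * 50)
print(query(index, "What are the termination clauses?"))
```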
Now, that technology, cast forward into the market, has made us quite successful. We are leading the way in the category that we are pioneering called Hyperautomation. This is when you're getting 99% accuracy. We have a long list of blue-chip customers across multiple verticals, many of them in highly regulated industries where data security and PII are paramount. But we also operate in transportation, logistics, manufacturing, CPG. We work with top-tier partners; we'll mention a couple of them later on in some profile cases. And we are very well capitalized. We're lucky to have raised roughly $300 million. Tier-one investors Stripes, Tiger, and Bessemer are the primary investors in the company now.
I will say that some of the early successes that our customers have with Hyperscience are just pure math. It's quite simple: if you take a customer who's struggling with 65% accuracy and lots of downstream human processes, BPO, and keying work, and you take them to 99% accuracy and 98% automation, you can imagine what drops to the bottom line very quickly. For many of our customers, the first step in that journey is immediate ROI from an AI approach to the problem of automation and understanding information. And a couple of examples of that which we can talk about, because they are public, are in the public sector.
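To put rough numbers on that "pure math," here is a back-of-the-envelope calculation. The document and field volumes are made up purely for illustration.

```python
# Hypothetical volumes, for illustration only.
docs_per_year = 1_000_000
fields_per_doc = 20
fields = docs_per_year * fields_per_doc

before_accuracy = 0.65    # fields correct without human keying, prior state
after_automation = 0.98   # fields handled with no human touch, at 99% accuracy

keyed_before = fields * (1 - before_accuracy)   # 7,000,000 fields fixed by people
keyed_after = fields * (1 - after_automation)   #   400,000 fields touched by people

print(f"Manual field corrections: {keyed_before:,.0f} -> {keyed_after:,.0f}")
print(f"Reduction in manual work: {1 - keyed_after / keyed_before:.0%}")  # ~94%
```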
The first one I'll highlight is the US Department of Veterans Affairs. We run this with our partner, IBM. We've been engaged at the VA for roughly three years, and at this point, we are saving them over $470 million annually over the prior state, before they used Hyperscience. Every single veteran's claim is being processed and touched by Hyperscience right now. That's over a billion pages a year. Similarly, we work with our partner Accenture Federal at the Social Security Administration. This is a younger engagement. We run about a hundred million claims per year at the SSA, with estimated savings of about a hundred million dollars annually. This past year with Accenture, we were granted an $82 million single-source agreement to expand Hyperscience inside the SSA. So we have become really critical infrastructure in both cases. They are using human-in-the-loop exception handling, that pioneering innovation from Hyperscience, as well as validation and decisioning inside Hyperscience. And I'd be remiss not to say that probably the most important thing here is that at the VA, we took the processing time for a veteran's claim from a matter of three months down to a matter of days. It's really hard to quantify how much value that has for our veterans.
There are some other things, in addition to our success in the market, that are really interesting. In the last few months, we've been recognized by analysts across the board as leaders, which is super satisfying for those of us who've been working so hard on this. What it really means is that the bets we made as a company, AI-first, in infrastructure and in a platform for sovereign models to do these workloads, were right. You know, it's kind of like that joke about "it took us 10 years of hard work to become an overnight success." But the market is really moving in our direction. Between Forrester and IDC, GigaOm last week, and buyer's guides, things like that, we are consistently named a leader in capability. Now, we're particularly proud of the Forrester Wave. That's not actually an IDP report; it's about core technology for document mining, and we're a hair's breadth behind a company called Google as a leader in strategy, all the way to the right there. So we're thrilled to be landing in the market where not only are our customers getting value, but analysts are seeing that our bets are the right ones.
And there’s a couple of trends in there that I’ll point out, and then I’m gonna hand that off to Jamie to talk about R40. The first one is that most analysts are saying that, okay, the hype cycle around AI has been really exciting, but everybody’s sobering up a little bit and one of the key corollaries is there will not be one giant model to rule them all, right? I mean, there’s gonna be an ensemble approach to workloads for AI, and that the real gold mine is inside the enterprise, right? It’s data that is not public and that you need to secure. And so what every analyst is saying is the value will be in smaller sovereign models which are run and controlled by organizations who need to have security and models that are accountable for accuracy inside their environment. So they’re saying exactly what Hyperscience has been paddling toward for the last 10 years.
A couple of other trends coming out of those reports are very interesting for us: 82% of organizations are still dependent on some paper-based process and manual routing of tasks via emails. That second piece is very interesting to me. And 90% of data in the enterprise is still unstructured. By the way, that number hasn't moved in 10 years. It's amazing. What that means is that as we do the easier stuff, more of the harder use cases come up. But those two things tell me that this is a very ripe market for an AI-first approach with governed sovereign models that help you manage security and risk. It also tells me that the RPA market has really kind of failed, that the old-school ways of thinking about automation with BPO, et cetera, are not working.
And then the other stat in here that's very interesting is either 42% or 56%, depending on which side of the pond you're on—we could debate Brits versus Americans and who's more risk-averse—but across the board, it's the concern for privacy, security, and governance around data. The more interesting Gen AI gets, the more concern people have about: "Where's my data? How's my model being trained? Who can see it? What's the provenance of that? Can I control that?" And so all of those workloads by definition are shifting on-prem, or at a minimum to somewhere where the customer controls the models and controls the data. All of those things are really just support for Hyperscience's approach in the market. And we're super excited to land some of our investments exactly on these trends.
So, with that brief introduction, I wanna hand it off to Jamie Wittenberg to tell us a little bit about what’s coming in R40, some of the specific advancements. Jamie, take it away for us.
Jamie Wittenberg: Thanks, Brian. I certainly will. So let me make sure I understand what you're saying. Enterprises have mountains of unstructured content across a variety of sources, including paper and digital documents, and they need to get the data out of that unstructured content, and structured content as well, into business systems and systems of record, in standardized formats, at a high accuracy rate, with very little human intervention, and without devoting entire engineering and data science teams to it. Is that what you're telling me?
Brian Weiss: Nailed it. Exactly.
Jamie Wittenberg: Well, with the latest version of the Hyperscience Hypercell, that is no problem. So bringing AI to the enterprise for automation is not simple, right? For starters, one popular model repository has over 1 million models available. How do you even choose the right one for the particular job that you wanna do? Whether that’s classifying documents, extracting tabular data, extracting unstructured data, transcribing data. Then once you’ve chosen the model, how do you annotate your dataset to feed that into the model, to train it? Once it’s trained, how do you gather your data from all of the various physical and digital sources, run the model on it, get the output of the model into your business process or your system of record, measure the accuracy of that model, manage drift, manage data privacy, manage access? So there’s a lot to be done just to get a single AI automation use case running and Hyperscience’s entire mission is to make that a turnkey operation for the enterprise. This is what we’ve been doing for 10 years, and with V40, it just gets better.
Today, we’ll cover some exciting advances in V40 that increase the range of your business’ content that you’re able to automate, how we’re making it even easier to manage your model lifecycles, and how we’re increasing transparency and traceability of all your AI processes. I wish we could talk about everything coming in V40 today, but we’ll focus on some highlights. So let’s get into it.
Our first area of innovation that we’ll cover today is how we enable enterprises to unlock critical insights from dense content, from unstructured content—that 90% we’ve been talking about—with our new long form extraction model. Like we said, 90% of new business data is unstructured, and this data is much more complex to extract than a legacy OCR platform can manage. These documents can be very long, they can have inconsistent formatting and contextual ambiguities. You can’t just look at an unstructured document and immediately tell what it is by the visual format, unlike an invoice or a tax form. And the content in those documents is very nuanced. A document might say “evidence found” or “no evidence found,” and you need a specialized model to pick up the difference. These documents might have complex hierarchies and nested structures like a lab report or a pharmacy formulary. You might be dealing with multiple occurrences of the same information, such as multiple policyholders or multiple interactions between a provider and a patient.
There was a lot of hype and experimentation around LLMs that some folks in the industry thought might crack this nut, but unfortunately, LLMs are really general models that are not highly tuned to this type of problem. And so we took all of this into consideration when building our new Long Form Extraction deep learning model. This model easily identifies multiple occurrences of the same information in the same unstructured document. You tell the model, “Hey, find me all the policyholders in this proof of insurance,” and it will do just that at the 99.5% accuracy rate that we’ve provided in the past with our structured and semi-structured extraction capabilities.
Speaking of structured and semi-structured, these documents tend to be quite a bit shorter, right? For example, a W2, an American tax document, is literally one quarter of a printer page, and an invoice is usually a couple of pages long with data points and clauses that are pretty short. And we have purpose-built models for those use cases. On the other hand, unstructured documents can be hundreds of pages long, with data points and clauses that are much longer than, say, the income box on a tax form or an item description on an invoice.
So for that reason, we built our long form extraction model to handle documents that are up to 200 pages long and to extract clauses and data points of virtually unlimited length. This model can find data on the page without relying on the visual formatting cues of an invoice or a tax form. So bring us your longest, densest documents and let the Hypercell get whatever data out of them that you need to review applications or claims, audit transactions, or really whatever paper-based business process you wanna automate. This model will prove useful across a variety of use cases in banking, legal, insurance, and even the public sector.
Keep in mind that the process to set this up is to just upload a set of documents to the Hypercell and use our training data management capabilities to annotate those documents in the UI. Business users can do this, by the way; you don't need your data scientists or specialists or your services team for this process. Then you import a workflow that Hyperscience has built, connect it to your data sources, and turn it on. And with that, Rich is gonna show you how we do it.
Rich Mautino: Thanks, Jamie. Morning everyone. Go ahead and just pull this up here. Really excited to show you this new capability because it solves a problem that we’ve been hearing more and more about. As we press deeper into digital transformation with a lot of real world use cases, we wanna be able to identify and orchestrate more than just a single occurrence on a document. So more and more organizations have been asking for help with complex, lengthy documents like contracts, warranty deeds, medical records, and others where the fields we’re looking at may occur more than once.
Let me double check the sharing setting once again. Thanks for that. So with these complex documents—contracts, warranty deeds, medical records—we might have fields that occur once, several times, or not at all. So what I'll show you is a submission here; I'm gonna go ahead and initiate it, and while that's running, we'll have a look together at what's actually running.
So you can see this is a confidentiality agreement, and we might have some parties here, some dates, terms, we may have a jurisdiction and whatnot. And we've got it spread across a five-page document. So while I let this run, I can show you kind of what's under the hood. In this case, you can see where we enable this new capability: at the bottom on the right, you see Long Form Extraction is enabled. This is our engine type, and this field ID model is about a hundred times larger than the generic one in the traditional approach. And it's all muscle: we see that in the benchmarks, because it consistently outperforms across the board.
Now, when we look into this layout here as well, we can see the fields we've asked Hyperscience to find. You can see effective date, party, and in this case here, "Multiple Occurrence" has been checked, meaning we might have none of these occurrences, we might have just one, and there may be several. Same with jurisdiction, and we can toggle all of this effortlessly. So it makes it very easy to get very custom where you may have different combinations—you may have, in a medical document, two dates but three symptoms, for example.
Rich Mautino: So coming back into the submission and taking a look at how this is going, you can see we've identified the document and we've also identified the fields. In this case, it's found all of the fields we're looking for, and it's in the process of transcribing 'em. So very quickly, we've been able to identify multiple parties here; this is our multiple occurrence in this scenario. You can see parties one, two, and three have all been identified, and Hyperscience in real time is in the process of transcribing them as well. So it will go through on a highly automated basis and grab everything we're looking for.
And you can see while I’ve been speaking, we’ve actually transcribed all of these. So the effective date’s been found here. I’ve got three different parties that have all been found. I’ve got a jurisdiction, and more importantly, I don’t have a term here. So multiple occurrence also gives us the ability to move forward when maybe there isn’t a term at all, rather than getting hung up. And when I come back to this submission, you can see in about a minute’s time, we’ve achieved a hundred percent automation. We grabbed a hundred percent of all fields and a hundred percent of all transcription. So we’ve preserved both the high automation and the high accuracy and done so on a complex lengthy document.
Brian Weiss: Rich, that is awesome. Thank you. And so look, one of the things that we're seeing with our customers is the desire to break up long-form documents in order to drive maybe a Gen AI use case, an agentic use case, and Jamie's gonna tell us a little bit about that.
I do wanna take a quick audible, however. I've got a question from Andy Clarke. He's asking, "Is the deep learning model for long form extraction still run on LLMs?" It's not run on LLMs at all. So let there be no confusion: this is a Hyperscience model. You are using Hyperscience models and building on top of them with your own data. When we use LLMs, they're always ancillary to the core problem that we're solving in extracting data. So this is not a wrapper around some LLM. This is actually a Hyperscience model that you train on your data. We'll punch out to LLMs for very specific questions, but by default we do not use them on the back end.
Jamie, can you tell us a little more about how customers are using this to unlock Gen AI experiences?
Jamie Wittenberg: Yeah, absolutely. So this is actually the whole reason why we launched the Hypercell for Gen AI. Many enterprises are eager to harness the power of generative AI, right? But they are unsure how to do so in a way that is compliant, scalable, sustainable, and most importantly, accurate. Like I said earlier, Gen AI is powerful, but it's not purpose built for these types of use cases.
So for example, we know that some LLMs have hallucination problems, others have segmentation problems. Using Hyperscience's proprietary long form extraction model, which is a deep learning model, a neural net, and our transparency and governance features, which we'll cover later, you can overcome some of the shortcomings of these all-purpose Gen AI models. For example, instead of submitting an entire document to a Gen AI model and prompting over and over again to get your desired output, what you can do with the Hypercell for Gen AI is take the long form extraction model, get the data points that matter to the question you're trying to answer, and send just those to the LLM. The reason this works so nicely is that reducing the amount of data submitted to the LLM helps overcome the segmentation problem, by reducing the number of segments the LLM has to review (and maybe saving you some tokens in the process), and it increases accuracy by reducing how much content sits in the context window.
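A sketch of that extract-then-prompt pattern is below. The field names, prompt shape, and call_llm function are placeholders for illustration, not a real Hyperscience or LLM-vendor API.

```python
# Extract-then-prompt: send the LLM only extracted data points, not the document.

def build_focused_prompt(extracted_fields: dict, question: str) -> str:
    # Only the data points relevant to the question enter the context window.
    lines = [f"{name}: {value}" for name, value in extracted_fields.items()]
    return question + "\n\n" + "\n".join(lines)

def call_llm(prompt: str) -> str:
    # Placeholder for whichever LLM endpoint the pipeline is configured with.
    return "model response"

# Output of a long form extraction step (invented example values).
extracted = {
    "transaction_1": "2024-03-02, wire transfer, $9,900",
    "transaction_2": "2024-03-03, wire transfer, $9,900",
}
prompt = build_focused_prompt(extracted, "Do these transactions appear fraudulent?")
answer = call_llm(prompt)
print(answer)
```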
So you could put this to work in all sorts of use cases. One example is fraud detection in insurance: you can take just the transactions, scrub any personally identifiable information, submit just those transactions to the LLM, and ask the LLM whether they seem fraudulent. You can do a similar thing in financial services and ask whether transactions seem risky. So to sum this up, Hyperscience's purpose-built models and overall AI automation infrastructure can help your business actually get the most out of an LLM and overcome some of those problems that are built into it.
Brian Weiss: Awesome. Jamie, let's shift gears a little bit and talk about the platform itself and some of the innovations we are bringing out to streamline model lifecycle management in the Hypercell.
Jamie Wittenberg: Yeah, so we've talked quite a bit about our brand new unstructured extraction capabilities and how you can use them, right? In this section we're gonna talk about the infrastructure and operations required to stand up use cases and manage your models—what Gartner's been calling Model Operations. The Hypercell provides model management capabilities right in the same simple business-user interface as the extraction capabilities, and this can help you maintain your automation and accuracy rates as your business needs change year over year and as we launch exciting new features for you.
So as I mentioned earlier, there are a bunch of exciting new features coming out in V40. We'll dive into four of them in this section: extended model compatibility, layout ID matching for structured forms, field-level accuracy targets, and document drift management, which Rich will also demo for you at the end of this section.
So as I mentioned earlier, it is our mission to be a turnkey operation. That means we don't want it just to be easy to get started, and we don't want it just to be easy when you're in production; we want it to be easy to manage upgrades and other operational changes. Hyperscience releases new features several times a year, and we want you to be able to upgrade your Hypercell and enjoy those enhancements without a ton of operational overhead. So one thing we did in service of this commitment was decouple our model and flow versions. Now they're forward compatible with more versions of the Hypercell, so you can get all the great benefits of upgrading the Hypercell without retraining your models, again meaning you don't need a data scientist to help you with that upgrade.
Next up, we have Field Level Accuracy Targets. The Hyperscience Hypercell lets you specify accuracy targets at the field level rather than at the document level. Accuracy targets determine how often the machine raises its hand for help from a human, that just-in-time human in the loop that Brian mentioned earlier, and to an extent how much infrastructure the machine uses. Many of your documents will have fields with varying levels of criticality. For example, social security numbers and dates of birth are more impactful than whether a customer has elected to receive, say, digital versus paper communications. So the Hypercell enables you to set a high accuracy target for those fields you can't afford to get wrong, targeting your human resources and your infrastructure where they matter most, while still enabling you to capture the less critical pieces of information at a lower accuracy target, without sacrificing automation or efficiency. With a different platform, you might choose to forego collecting those less critical but still valuable fields to avoid disrupting your process or your human exception handling. But with the Hypercell, you don't have to choose, and you can continue to tweak and tune right in the UI, again saving you services dollars.
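The routing logic behind field-level accuracy targets can be pictured like the sketch below. This is a simplified stand-in (raw confidence thresholds against invented field names), not the Hypercell's actual calibration mechanics.

```python
# Simplified illustration of per-field accuracy targets; values are invented.
FIELD_TARGETS = {
    "ssn": 0.999,             # can't afford to get this wrong
    "date_of_birth": 0.999,
    "comm_preference": 0.95,  # less critical, tolerate a lower target
}
DEFAULT_TARGET = 0.99

def route(field: str, model_confidence: float) -> str:
    # The machine "raises its hand" only when confidence falls below the
    # target set for that specific field.
    target = FIELD_TARGETS.get(field, DEFAULT_TARGET)
    return "auto_accept" if model_confidence >= target else "human_review"

print(route("ssn", 0.97))              # human_review: below the 0.999 target
print(route("comm_preference", 0.97))  # auto_accept: above the 0.95 target
```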
Speaking of accuracy, it's paramount to correctly classify documents to their layouts in order to extract the right information from them and kick off the correct downstream processes. If your documents happen to have unique identifiers on them—something we see pretty commonly in government forms, insurance, and the financial industry—we can actually force a match between documents carrying those layout identifiers and the appropriate layout template, without writing any custom code. Right in the Hyperscience UI, on the layout variation template, a business user can type the identifier into the template definition, and then matches will be automatic. This increases both automation and accuracy at the same time, without additional human-in-the-loop resources or messy rules-based automation, which is pretty cool. This is especially useful in contexts where you might have hundreds or even thousands of nearly identical layout variations that differ by only a word or two between regions or quarters. We can make sure that you classify those to the correct layouts the first time, so we extract the right data and that data gets into the right business process downstream.
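Conceptually, identifier-based matching can be as simple as the sketch below. The form codes and layout names are invented examples, and the real feature is configured in the UI rather than in code.

```python
# Toy illustration of forcing a layout match on a known identifier string.
LAYOUT_IDS = {
    "FORM-1234 (2024)": "enrollment_layout_2024",
    "FORM-1234 (2023)": "enrollment_layout_2023",
}

def match_layout(page_text: str) -> str | None:
    # If a known identifier appears on the page, the match is deterministic.
    for identifier, layout in LAYOUT_IDS.items():
        if identifier in page_text:
            return layout
    return None  # otherwise fall back to model-based classification

print(match_layout("...FORM-1234 (2024)... applicant name: ..."))  # enrollment_layout_2024
```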
Earlier I mentioned upgrades, but that's not the only potential source of operational overhead. When managing any software platform, your organization will probably experience new layout variations on a regular cadence, sometimes unexpectedly. To handle this case, we've launched a feature we're super excited about called Document Drift Management, also known as Layout Triage. This feature helps you incorporate documents that your Hypercell instance has not seen before into your workflows, whether you're setting up a new Hypercell installation for the first time or you're incorporating new documents as your calendar or fiscal year changes over. This means you don't need to know every form that might hit your Hypercell workflow to get up and running. You can start with a few typical forms, and as less common forms come through, they get routed to the Layout Triage experience, which again allows business users to quickly create a layout for them, resubmit the documents that came in unmatched right from the UI, and get them back into your workflow without involving data engineering or IT. Rich, let's show them how we handle previously unseen forms without training a new model or disrupting ongoing processing.
Rich Mautino: Sounds great, and this one’s really timely. It’s hard to believe we’re almost into November and a lot of organizations somewhat dread this time of year because it’s a mad scramble to figure out which forms are gonna change. And sometimes these changes happen outside of our control. So as Jamie mentioned, a lot of times we don’t find out about these changes until after they start coming in. And then it’s a huge effort to try to catch up, train new models, build new layouts, and adjust for that “drift” as we call it. So we’ve made this very easy and we’re excited to release it in time for the new year to turn over.
So in this case, I'm gonna submit five packets, and I'll show you what we're submitting: it's a four-page application for New York State enrollment. When I go ahead and submit this, you can see I'm gonna run it through as I would normally, and we take a look at the submission as it runs through. Say we're into the new year, and it runs through and it says, "No variation found. I don't recognize this." So the example I like to give is: your friend got a new haircut, you haven't seen them for a few weeks, and they knock on the door. Rather than forming a new friendship because this human being has a new haircut, and training models from scratch, we can just quickly recognize that it's the same person we've always known, they got a new haircut, mentally catalog that, and move on.
Rich Mautino: So in this case, no variation has been found. We simply come into the Unmatched tab and we can see the submission: the same one that came in, with 20 unmatched pages. So we can run a grouping job here, and that'll go ahead and run a grouping effort on it; we can see that's completed already. We'll go into our page groups here, and we see we have a total of four groups. Remember, from the document I showed you, there's four pages and there's five of each of these packets. So we'll go ahead and triage all of them together.
And you'll notice here, you don't need to be a data scientist to work this. We've made it incredibly easy for a business user to rapidly triage, make the adjustment, and get operations back on the road. So much so that we actually provide these step-by-step instructions here, so a new user who's unfamiliar with this can simply follow the steps and everything works automatically from there. But in this case, I've got my first group and I'm ready to triage it. And you can see at the bottom right-hand corner, this is page three of four, and I see at the bottom it's a New York 2024 form. So I know, okay, we've got a new version of this form. It's a little bit different, maybe, than the 2023 version of the form. Maybe there's a new block, maybe something's changed.
So rather than starting over, all we do is come in here and select the New York application. We give it a new code, 2024. And in this case, this is a structured form, so I select that. And with the index here, you can see this is page three, so I simply enter page three, and now I give it a blank form, and that will go ahead and start triaging it. You can now see that group has moved up into the top left-hand corner. So that's one triaged, three to go. And all we have to do is follow the same exact process, and we can move very quickly. You can see everything's point and click here. We see at the bottom that this is page four, so very rapidly we can triage this. And with that, it's two down, two to go.
Rich Mautino: So same thing here. We still select the New York document, and we have the ability to look at it and make sure that we're not just on autopilot and mishandling the drift. But in this case, it's still '24, and it looks like this is the first page. We save that, upload the blank version once again, and move on to our last group, where we simply click through. We look at the bottom right: this is page two. So we select page two here, save it, feed it the blank, and upload it. So now you can see all four groups have been triaged. We have nothing on our to-do list, and it's indicated in the top left with that green icon that the triaging is complete.
So now we can view potential layouts to go ahead and fully triage this. In this case, we've got a potential layout: this is the New York application. So we'll go ahead and click the play button, we'll continue, and we'll go ahead and load this model in. I've named it 2024 Drift for the purposes of this demo, and this is a new layout being created, all with just a couple of clicks. So I'll go into my library, and you can see right at the top that it is now created. All I have to do now is follow the same process I would with any other layout: I create a release. I'm gonna go ahead and add that new release here, select that layout for the release, choose it, and create it. And now I simply need to assign it to that same flow, and we'll be ready to resubmit this and get a very different, positive result this time. So I'll assign it to the flow here, the same flow as before, "Drift Management R40 Flow."
And now that that's done, you can see that in a matter of a couple of minutes, I've triaged this packet completely, and I can come back into the same submission I started with, 2609. All I have to do is resubmit it. So I resubmit it, and I've got a new submission ID that's just popped up. And now, instead of processing and ending up with 20 unmatched pages, you can see as the workflow goes through and Hyperscience orchestrates it, the Hypercell correctly identifies the variation: it identifies that it's a New York application. We've got 20 document pages, now five documents in total, and none of 'em are unmatched. It's running through, and it's now complete. So this same submission, we've effortlessly, without being a data scientist, triaged it, resubmitted it, and achieved a hundred percent automation on the transcription, all right here in the interface, making it very easy to adjust for this drift. And you can imagine at scale, this is incredibly powerful. So we're really excited to have this ready for deployment going into the new year.
Brian Weiss: Awesome, Rich. Now look, I wanna point out one thing. You didn't retrain any models, did you?
Rich Mautino: No. Absolutely not. No need to.
Brian Weiss: No, you didn't. I know a ton of folks on this call are gonna be really thrilled about that, because the incremental changes in forms have been driving them bananas. Being able to get back to high performance very quickly matters. And you saw what Rich did there, folks: in a matter of minutes, he was able to incorporate those variations without having to retrain models.
Jamie, let's pivot a little bit now to something that's not as demoable, not as exciting to show, but really critical to our customers: Governance and Security. Let's talk about what's in R40 that enhances our profile here, for the folks on the call.
Jamie Wittenberg: Yeah. So we've been talking about this, right? Hyperscience provides a layer of security and governance right in the Hypercell, so that you know you're using Hyperscience within your company's data and access compliance policies. We believe there are three things that you need to run an AI program securely and compliantly: auditability, governance, and transparency. Maybe you're concerned about adopting AI because of the black box problem; you need to know who made what decisions when, and if there are mistakes, where they came from. You're also trusting us with your most critical data, and we make it easy for you to monitor and restrict who has access to what.
So in pursuit of FedRAMP High accreditation, which we are targeting to achieve by the end of the year, we've made a number of updates to the Hypercell to meet the stringent security requirements necessary for that accreditation. And most of these updates will actually be available to all of our customers across installation types: on-premise, private cloud, or SaaS. With V40, amongst many, many updates, we've made significant improvements to our Audit Logging feature set in terms of both coverage and content. Just about every click made in the software is now captured, and there's an enhanced version of audit logging that customers can turn on to get a little bit more data about each of those clicks. This feature set is available in the UI and over API, so if you want to extract these audit log records into a third-party system where you do your internal compliance activities, that's really easy to set up as well. Brian, do you wanna tell them about our overall operational stance on security and governance?
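As a rough picture of what pulling audit records into an external compliance system could look like, here is a hypothetical sketch. The endpoint path, parameters, pagination, and auth header are assumptions for illustration only; consult the actual Hyperscience API documentation for the real interface.

```python
# Hypothetical audit-log export; URL, endpoint, params, and auth are assumed.
import requests

BASE_URL = "https://hypercell.example.com"        # assumed instance URL
HEADERS = {"Authorization": "Token <api-token>"}  # assumed auth scheme

def fetch_audit_logs(since: str):
    # Assumed endpoint and page-based pagination, for illustration.
    page = 1
    while True:
        resp = requests.get(
            f"{BASE_URL}/api/audit_logs",
            headers=HEADERS,
            params={"since": since, "page": page},
        )
        resp.raise_for_status()
        records = resp.json().get("results", [])
        if not records:
            break
        yield from records  # forward each record to SIEM / compliance tooling
        page += 1

for record in fetch_audit_logs(since="2024-10-01T00:00:00Z"):
    print(record)
```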
Brian Weiss: Yeah, I mean, look, this is a large but also somewhat incremental push for us to increase our security profile. We already operate at the absolute highest standards. We work in very rarefied environments with incredibly sensitive data: government data, personally identifiable to the highest degree. So whether you're on-prem or operating in SaaS, you're getting that level of concern for security, governance, and, frankly, AI trustworthiness in working with Hyperscience. That is a foundational principle of ours.
What's great about this last piece of work is that pursuing FedRAMP High not only allows us to break into the government market in a different way—we'll be able to offer our own solution in the cloud—but all of those requirements filter down to our on-prem customers as well, including our maybe-not-so-regulated customers. So what that really means is that you can rest assured you're getting the highest level of concern for security, governance, and AI transparency with Hyperscience going forward, and that will only continue. We're super excited about that, but it's also just a basic principle that we continue to invest in as we move along.
Look, we're coming down to the end of the webinar here, and I wanted to do two things. First, thank you all for being here. It's really hard for me to figure out what I like best: is it Layout Triage and Drift Management? Is it Field Level Accuracy Targets? Is it the Long Form innovations? I'm not gonna ask y'all to vote right now because we don't have that much time. But what I do wanna do is go down to the Q&A; there are a couple of questions that have been kicked up here, and I recognize some customers coming in. So let's jump in, we've got a few minutes.
Jamie, we've got a question from James, which I'm gonna assume is about Layout Triage: whether it's replacing Layout Finder. And then there's one further down about the upper limit on fields in long form extraction. Why don't you hit on both of those?
Jamie Wittenberg: Yeah. So the first one here is: "Is the layout triage experience replacing the potential layout finder?" Yes, it is, and we think it's a really great replacement. Then the second one you mentioned was on the upper limit on the number of fields in long form extraction. There is no field limit: 200 pages, as many fields as you want. We are super, super excited about that.
Brian Weiss: Let's see, what else do we wanna get into here? Kevin, hi, good to see you. You asked about whether this particular feature is only available on unstructured layouts. Can you tell me which feature you were referring to? Pop that in the chat or in the Q&A, and I can answer that one.
Yeah, and in the meantime, Shane Reeds, you're asking "whether, with the privacy concerns, we still see a decent number of customers wanting to install on-prem?" The answer is yes, particularly in regulated markets. There is an increasing concern, when you're working with PII or any type of privacy-sensitive data, to manage models yourself, and rightfully so. When those third-party models are not owned by you, you don't understand how, where, and why they're gonna be updated, at what points, and on what training data. So there is still a lot of concern, and many of these AI workloads are coming on-prem. We of course live on-prem for a large portion of our customer base, and if you go SaaS, you're gonna get the same level of security with Hyperscience. So the answer is yes, we do see it, particularly in regulated markets.
Rich Mautino: Brian, I have a question here on increasing the page limit. So 200 isn't a hardcoded rule; that's a general guideline for a normally equipped Hypercell. Depending on the complexity of your documents, it may actually be able to handle more than that. So what we'd encourage is to take a deeper dive in a one-off meeting and take a closer look. In short, there are ways to tune performance and increase that limit, so we just want to look at it with you.
Brian Weiss: It’s not a hard limit at 201. It’s not architecturally limited. It can be as long as you want, but of course we want to be appropriate to the size of the Hypercell that you’re working with and your expectations on performance.
Jamie Wittenberg: Alright, moving on. We've got one here from Kristen Highhouse at the SSA. Kristen, you asked "if there is a built-in semi-structured field ID model for W2s." W2s are actually highly structured forms, so you can use our structured classification and field extraction for those. Similar to the layout triage experience that Rich demoed, you can upload a W2, point and click the fields that you want, and assign it to an IDP flow without actually training a model.
Brian Weiss: Yeah. And layout triage helps with the variations. I think what you're getting at is that some customers in the past would've built a semi-structured model to catch the variations that come up. What layout triage allows you to do is achieve that level of flexibility and add those variations very quickly to a model without having to think about a retraining process.
Rich Mautino: It's also worth noting that we're not finished here either. This is an area of significant investment, and we're gonna continue to build on and enhance these capabilities. We have our customers to thank for a lot of the inputs and ideas we get, and we're constantly looking to enhance and add to them. So with what you've seen today, we're just getting started. We're really excited to continue to double down on these lines of effort.
Brian Weiss: Yeah, thanks for raising that, Rich. I see customer names that I recognize, and one of the things I appreciate about coming to Hyperscience—I've been here about a year and a half now—is the active engagement with our customer base at the edge of our innovation. Between our customer advisory board and the daily work with folks, we help solve the nitty-gritty of the problems they're working on. And we are well-funded and nimble enough to be able to pivot and work toward some very specific use cases as well. One of those is the many, many customers coming to us who want to do something with Gen AI: they wanna train it on the language of their business, and they're trying to figure out how to do that responsibly, in a way that's accountable to accuracy. So that's another example of a place where we're collaborating actively with our customers, and in some cases also with hyperscalers.
Look, I think we're at time; we're about three seconds from time. So for my part, thank you very much for spending time here. We're super excited about this release. Hyperscience is in a really great spot, and what we have invested in for the last 10 years is proving to be exactly what the market needs and wants. We're thrilled with the attention that Gen AI is bringing to the space of AI in general, and with the types of innovations that customers are thinking about and trying to do with Hyperscience and beyond.
So I wanna say thank you on behalf of myself and Jamie and Rich, who will be featured in a James Bond movie, I'm sure, if we can get that headshot circulated properly. Any other questions in the chat, we will be tracking down and making sure we get the answers out to you directly. Thanks again for spending 45 minutes with us today, and we will see you on the next one. Take care.