Companies with legacy codebases that are looking to speed up their digital transformation or code migration are often faced with a lengthy, cumbersome project. Code Harbor from EXL aims to accelerate this migration through the use of generative AI. In addition to facilitating data discovery and governance, the tool addresses the manual effort of writing and optimizing code. Swati Malhotra, AI Solution Leader at EXL, demonstrates key features of Code Harbor.

This episode is sponsored by EXL, which drives business forward with data and AI. Learn more at exlservice.com, or contact codeharbor@exlservice.com for more information.
Hi everybody, welcome to DEMO, the show where companies come in and show us their latest products and services. Today I'm joined by Swati Malhotra, Senior Engagement Manager at EXL. Welcome to the show, Swati!

Thanks, Keith.
Alright, so what are you showing us today here on the show?
We know AI adoption is becoming increasingly strategic across industries. So today I'm going to talk about Code Harbor. It is a multi-agent solution that primarily accelerates platform migration and workflow assessment. It also facilitates data discovery and data governance, streamlines code conversion and code optimization, and brings efficiency to end-to-end testing.
Generally, who is this designed for within an enterprise? I'm assuming coders, developers, those kinds of roles?
And it's not just limited to that. The solution is really geared towards business leaders as well. As I mentioned, we work with our clients on their cloud transformation journeys. Typically these projects are very long and resource intensive. They demand huge investments of both time and budget: you need people, you need developers who are experts in multiple programming languages. And a lot of times, we've seen there is insufficient documentation of these legacy codes and applications.
So the way we have really designed Code Harbor is to cater to these different stages of migration by bringing together these multiple AI agents in a dynamic orchestration layer.
So that helps you right from code assessment and code prioritization, to metadata creation, tracking lineages and transformations, doing the actual code conversion, generating test cases, debugging the code, and finally tuning the code for performance optimization.
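To picture the architecture, here is a minimal sketch of a staged agent pipeline passing a shared context from one agent to the next; the agent names, the toy logic, and the simple sequential orchestration are illustrative assumptions, not Code Harbor's actual design:

```python
from typing import Callable

Agent = Callable[[dict], dict]

def assess(ctx: dict) -> dict:
    # Toy complexity metric; a real agent would inspect dependencies too.
    ctx["complexity"] = len(ctx["source_code"].splitlines())
    return ctx

def extract_metadata(ctx: dict) -> dict:
    ctx["metadata"] = {"tables": [], "variables": []}  # filled in by a real agent
    return ctx

def convert(ctx: dict) -> dict:
    ctx["target_code"] = "# converted code would go here"
    return ctx

PIPELINE: list[Agent] = [assess, extract_metadata, convert]

def run_pipeline(source_code: str) -> dict:
    ctx = {"source_code": source_code}
    for agent in PIPELINE:
        ctx = agent(ctx)  # each agent enriches the shared context
    return ctx

print(run_pipeline("data work.a; set sashelp.class; run;")["complexity"])
```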
So it's geared towards business leaders: CIOs, CTOs, CDOs who are looking to modernize their tech infrastructure, refresh their data pipelines, optimize their code, and streamline the pre- and post-migration testing landscape.

If I have to summarize: organizations that are migrating from legacy systems to more cloud-friendly, modern platforms; data professionals working on ETL pipelines and data transformations; QA teams responsible for testing and performance analysis; and primarily CIOs and IT decision makers who are driving these digital initiatives.
When you hear the term code migration, I'm pretty sure everyone gets a big headache and starts thinking, "Oh, this is a multiyear project. This is going to take forever."

If they didn't have something like Code Harbor, they would probably be going through a lot of these long projects, correct?
Or even manual, right? I was actually speaking to one of my clients the other day. He said, "By the time this migration is done, I'll be retiring." So yes, if enterprises don't have a solution like Code Harbor, they're looking at one of two scenarios.
One, they would be going with a completely manual, human driven approach, which, as you said, would take a very long time.
Or they would be relying on individual developer AI toolkits, which are more generic in nature and primarily geared towards enhancing day-to-day developer efficiency, rather than being tuned for specific migration use cases the way Code Harbor is.
Alright, so you've got a lot of cool features here to show. Let's jump right into the demo.
So for the demo today, we're going to transform a sample ETL SAS code to Python, but it's not just a code conversion. I will also be highlighting the different assessment and governance agents; we'll showcase the code conversion and optimization, generate the synthetic test data, and then test the code and perform data validation. So it's going to cater, like I said, to different stages of migration. This is a sample ETL SAS code which I uploaded to the solution. Let's go through the modules.
So, code analysis. Code analysis really is an under-the-hood analysis of the source code, in this case the SAS code, and it primarily highlights the composition of the code: what are the key dependencies, are there any external libraries being referenced, are there any files or data being imported? So overall, it helps us really understand the complexity of the code, and it gives the developer an idea of the effort involved in migrating it.
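As a rough illustration of what such an analysis pass extracts, here is a toy dependency scan over a SAS snippet; the regexes and the sample code are assumptions for illustration, and a production agent would use a real parser rather than pattern matching:

```python
import re

# Hypothetical SAS snippet to analyze.
SAS_CODE = """
libname mylib '/data/warehouse';
proc import datafile='/in/sales.csv' out=work.sales dbms=csv; run;
data work.summary; set mylib.customers; run;
"""

# Crude pattern matching for the kinds of facts the analysis surfaces:
# imported files, referenced libraries, and input datasets.
imports = re.findall(r"datafile\s*=\s*'([^']+)'", SAS_CODE, re.IGNORECASE)
libraries = re.findall(r"libname\s+(\w+)", SAS_CODE, re.IGNORECASE)
datasets = re.findall(r"\bset\s+([\w.]+)", SAS_CODE, re.IGNORECASE)

print("external files:", imports)   # ['/in/sales.csv']
print("libraries:", libraries)      # ['mylib']
print("input datasets:", datasets)  # ['mylib.customers']
```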
Okay, code explanation. It's a very detailed summary of the code: it breaks the code down and actually creates pseudo-code, or English descriptions of the code. And this really helps when we have developers who are proficient in one language but not the other, right? So this really creates documentation of the code. It saves hours and hours of pre-work, and it's especially helpful, like I said, when the original authors of the code are no longer around in the organization and a new team inherits these codes. So this documentation is very, very powerful for them.
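The explanation step boils down to prompting a model with each piece of legacy code and asking for a plain-English walkthrough. A minimal sketch, where call_llm is a placeholder for whatever model client a pipeline would use, not Code Harbor's interface:

```python
# Sketch of generating plain-English documentation for a code chunk.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("wire up a real model client here")

def explain_chunk(chunk: str, source_lang: str = "SAS") -> str:
    # Ask the model to document the chunk for developers who may not
    # know the source language.
    prompt = (
        f"You are documenting legacy {source_lang} code for a new team.\n"
        "Explain what this code does, step by step, in plain English:\n\n"
        f"{chunk}"
    )
    return call_llm(prompt)
```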
Next, the data dictionary. We know data governance is a very critical aspect of any migration initiative. So using the dictionary agent, we can actually create the end-to-end metadata, right? It extracts the key tables and key variables, creates descriptions for those variables as well, and, very importantly, tracks the transformation logic. And this is in plain English, which anybody can read, from the business teams to the IT teams. It really breaks down the transformation logic into simple English, and it does this at every step of the code, for every variable and every table.
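To picture the output, here is one plausible shape for the metadata such a dictionary agent might emit: tables, variables, descriptions, and plain-English transformation logic. The field names and example values are assumptions, not Code Harbor's actual schema:

```python
# Illustrative metadata record for one derived table.
metadata = {
    "tables": {
        "work.summary": {
            "source": "mylib.customers",
            "variables": {
                "total_spend": {
                    "description": "Total purchase amount per customer",
                    "transformation": "sum(purchase_amount) grouped by customer_id",
                },
            },
        },
    },
}

# Print the transformation logic for every variable of every table.
for table, info in metadata["tables"].items():
    for var, detail in info["variables"].items():
        print(f"{table}.{var}: {detail['transformation']}")
```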
Once we generate the metadata, we leverage that metadata to track the lineage, right? So this lineage really helps us establish the interdependencies across these different data assets and variables. It tells us the parent tables and parent variables, the source tables and source columns, and helps us track changes across enterprise data. Now, the lineage you're seeing today is just across one code.
However, this can be tracked across multiple codes belonging to the same language, or across codes of different programming languages as well.
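Conceptually, the lineage step just walks the parent links recorded in the metadata. A toy sketch, with an assumed dict layout mapping each column to its source columns:

```python
# Minimal lineage walk: trace a column back through its parents.
lineage = {
    "work.summary.total_spend": ["mylib.customers.purchase_amount"],
    "mylib.customers.purchase_amount": [],  # a root source column
}

def trace(column: str, depth: int = 0) -> None:
    print("  " * depth + column)
    for parent in lineage.get(column, []):
        trace(parent, depth + 1)  # recurse into each parent column

trace("work.summary.total_spend")
```

So once we assess the code and extract the metadata, the next step for us is to actually convert the code.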
But actually, before we even go into conversion, we're going to break down the code into logical segments, and that's what the chunking agent will do. And this is very important, because it pretty much refactors the code.
It simplifies the source code for conversion, and that results in a more optimal conversion of the code, right?
Rather than doing it in one pass, you've cut it into parts. It just makes it more efficient that way.

Yes, yes.
It makes it much more efficient that way; it's much easier for the LLMs to process and convert the code than passing it all in one go. And most of these LLMs, as we know, come with a context window limitation, so it's not ideal to pass the full code in one shot. So once we chunk the code, we actually leverage that chunked code for conversion.
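A toy version of that chunking step might split SAS source at step boundaries so each piece fits comfortably within a model's context window; the regex-based splitter below is an assumption, and real refactoring would respect the language's full grammar:

```python
import re

def chunk_sas(code: str) -> list[str]:
    # Split at SAS step boundaries (DATA and PROC statements).
    parts = re.split(r"(?=^\s*(?:data|proc)\s)", code,
                     flags=re.IGNORECASE | re.MULTILINE)
    return [p.strip() for p in parts if p.strip()]

sample = "data a; set b; run;\nproc sort data=a; by id; run;"
for i, chunk in enumerate(chunk_sas(sample), 1):
    print(f"--- chunk {i} ---\n{chunk}")
```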
And in this case, we are converting this SAS code to Python. So this is where each chunk gets converted to the target language of choice.
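The conversion itself is then a loop over those chunks. A minimal sketch, where convert_chunk stands in for the model call rather than Code Harbor's actual interface:

```python
def convert_chunk(chunk: str, target: str = "Python") -> str:
    # In a real pipeline this would prompt an LLM with the chunk plus
    # the extracted metadata for context; here we just emit a stub.
    return f"# TODO: {target} translation of:\n# " + chunk.replace("\n", "\n# ")

chunks = ["data a; set b; run;", "proc sort data=a; by id; run;"]
converted = "\n\n".join(convert_chunk(c) for c in chunks)
print(converted)
```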
Once the code conversion is complete, the next step for us is to leverage synthetic data. Now, very interestingly, with Code Harbor taking just code as an input, we were able to create documentation from the code and create the metadata to convert the code, and we leverage that same metadata to generate test cases through our synthetic data module, right?
So the solution is not really dependent on any external data or any reference data to create test cases. And that's where we, again, really harness the power of these large language models.
So we leverage the metadata generated by the data dictionary agent for all the input tables, and we create the synthetic data for those tables.
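In spirit, that means generating rows that match each column's type as recorded in the metadata, with no reference to real client data. A minimal sketch, with an assumed, simplified schema format:

```python
import random
import string

def synth_value(col_type: str):
    # Produce a random value matching the column's recorded type.
    if col_type == "int":
        return random.randint(1, 10_000)
    if col_type == "float":
        return round(random.uniform(0, 1_000), 2)
    return "".join(random.choices(string.ascii_uppercase, k=8))

schema = {"customer_id": "int", "purchase_amount": "float", "region": "str"}
rows = [{col: synth_value(t) for col, t in schema.items()} for _ in range(5)]
for row in rows:
    print(row)
```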
Is using synthetic data for testing just because a lot of companies don't want to go to live data yet, until they're sure the code is working?
Yes, that's one scenario. But a lot of times our clients are open to sharing their code with us in our environment, while they would not be open to sharing their data.

So whenever we are converting codes in the EXL environment, we leverage our synthetic data module to perform in-house testing before we ship the code to the client's environment, where the production data testing happens.
That's another great reason, thank you. All right, so you've got the synthetic data, right? What's the next step?
The next step is to test the converted code on this synthetic data, and our iterative debug module will assist the developers in resolving any errors we encounter in the code.
So far, as you can see, every section of the code is getting executed on the synthetic data.
So far there hasn't been any error, but as soon as we see a potential error, the assisted debugger kicks in, like in this case, and over a certain number of attempts it resolves the errors in the code.
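The shape of such an assisted debugger is a retry loop: run the converted code, capture the traceback, hand it back for a repair, and stop after a fixed budget. A sketch under those assumptions, with fix_with_llm as a placeholder repair agent:

```python
import traceback

def fix_with_llm(code: str, error: str) -> str:
    raise NotImplementedError("prompt the model with the code and traceback")

def debug_loop(code: str, max_attempts: int = 3) -> str:
    for attempt in range(1, max_attempts + 1):
        try:
            # Execute the candidate code (over synthetic inputs in a real run).
            exec(compile(code, "<converted>", "exec"), {})
            return code  # success: no exception raised
        except Exception:
            error = traceback.format_exc()
            print(f"attempt {attempt} failed, requesting a fix")
            code = fix_with_llm(code, error)
    raise RuntimeError("could not repair the code within the attempt budget")
```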
So again, it's saving hours and hours of work for these developers, and testing is a significant effort. I would say 50% of migration effort is just on testing, and that's where our iterative debug module really saves a lot of time. So let's go to the data validation.
So once the code is debugged and refactored, our final step is to compare the output generated from the source code, which in this case is SAS, against the output from the target code, which is Python, over a series of data validation metrics: conformity, duplicity, format checks, and also checking whether the output values match and whether the column names match.
So this really ensures that we are not just converting the code and debugging the code; we are also making sure there is output conformity as well, right?
So, as you can see here, we're comparing the same table across SAS and Python, checking the output values across these various metrics.
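Those checks reduce to comparing the two outputs field by field. A minimal sketch, assuming both runs are loaded as lists of dicts; a real pipeline might do the same checks with pandas:

```python
def validate(source_rows: list[dict], target_rows: list[dict]) -> dict:
    # Compare column names, row counts, duplicates, and values.
    src_cols = set(source_rows[0]) if source_rows else set()
    tgt_cols = set(target_rows[0]) if target_rows else set()
    distinct = {tuple(sorted(r.items())) for r in target_rows}
    return {
        "column_names_match": src_cols == tgt_cols,
        "row_counts_match": len(source_rows) == len(target_rows),
        "duplicates_in_target": len(distinct) != len(target_rows),
        "values_match": source_rows == target_rows,
    }

sas_out = [{"id": 1, "total": 10.0}, {"id": 2, "total": 7.5}]
py_out = [{"id": 1, "total": 10.0}, {"id": 2, "total": 7.5}]
print(validate(sas_out, py_out))
```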
And I just wrote down some quick stats from you guys: this enables a 3x to 4x speedup for digital and cloud transformation, a 60% faster documentation process, and a 20% reduction in code compute time. That's really impressive.
And the demo I showed today was on SAS to Python. However, there are many language combinations we can leverage Code Harbor for: native or traditional SQL to cloud SQL, Java server to microservices, R to Python, SAS to Python. So there are many combinations we are seeing with our clients that we can leverage Code Harbor for. Another thing: there are different ways in which our clients can consume Code Harbor. We have integrated all of these different agents into an IDE plugin, for example for VS Code.
So a lot of our clients are actually leveraging these agents as different modules of the VS Code plugin.
What's the best way for a client to reach out to you guys to see a demo or get more information? Because there's a ton here, and it looks awesome.
This solution is not really an off-the-shelf capability, right? We really fine-tune and customize it for every client to cater to their specific needs; the orchestration layer itself needs to be fine-tuned for every client. So we do offer demos, and we do invest in free POCs with our clients to really show the value of the solution. And you can always reach us at CodeHarbor@exlservice.com.
Swati, thanks again for showing us the demo. That's all the time we have for today's episode. Be sure to like the video, subscribe to the channel, and add any thoughts you have below. Join us every week for new episodes of DEMO. I'm Keith Shaw, thanks for watching.