Welcome to No Priors. Today we're talking to Tengyu Ma, assistant professor of computer science at Stanford and the cofounder and CEO of Voyage. Voyage trains state-of-the-art components for next-generation retrieval systems, including embedding models and rerankers. We're really excited to talk about his research and the RAG debate today. Welcome, Tengyu.

Yeah, thanks so much. Thanks for having me here. We're looking forward to the debate.

Yeah. Why don't we start with a little bit of an overview of your research agenda to date? Because I think, uniquely, it covers a broad range of fields within and around deep learning, from theory to RL to embeddings and optimizers. So can you talk a little bit about how you pick the directions you have?

Yeah. So I think most of the papers I wrote have some theoretical thinking in them. I guess that's the commonality. And besides that, I've worked on quite a few topics, as you mentioned, ranging from theoretical understanding and mathematical proofs about deep learning systems all the way to practical large language models and deep reinforcement learning. These days, what we're working on centers more on the efficiency of training large language models and on improving reasoning for large language models. My vision is that in the future, efficiency is very important, because we are running out of data and compute, so we have to use the data and the compute much better. And reasoning seems to be a pretty important direction, and also, in some sense, a risky direction, in the sense that we don't know yet exactly how fast we can solve those challenging reasoning questions.

Mhmm. Can you mention a few of the key papers or pieces of work that you or students in your lab have done, just so our listeners can look them up?

In the very early days I worked on optimization for matrix completion; that's like ten years ago. Then I moved on to embedding models, like sentence embeddings and vector embeddings. One of the papers we wrote is actually a very simple paper where we average the word embeddings to get sentence embeddings and then do some transformations using PCA to make the performance much better. That was even before the transformer came out. Then I moved on to transformers, large language models, and contrastive learning, which is the new way of training embedding models. That direction started with some of the papers using contrastive learning for images, and we worked on improving those methods and understanding why contrastive learning works. And recently we've worked on optimizers for large language models. For example, one of the papers we wrote last year was Sophia, where we found a new optimizer that can improve training efficiency by 2x for pre-training.

This is great. Adam is very old at this point.

Yeah, it's ten years old now. I think that's the interesting part about it. People have tried so many times in the last ten years; there were so many papers published that had improvements over Adam in various cases. But so far, Adam is still the default algorithm for training large language models. And that's why we thought it was time to really spend a lot of time on this.
I think I started probably around 2018 or 2019. I asked a few students to work on this, and finally we had one paper out after a few years, after a few failed projects and failed ideas. And recently, one of our friends at Facebook actually used this in their large-scale multimodal training. I don't know exactly how many parameters they have, but I assume it's more than 100 million parameters. They found that at that scale there's a 1.6x improvement in training efficiency. So that's like $10 million versus $16 million.

That's super exciting. Yeah, I think Sophia has an opportunity to be really, really impactful. You started a company last year, taking leave from Stanford. Given that your work has been theoretical, but with practical applications, what drove you to do that?

I think I came to Stanford partly because there's a very strong industry connection here compared to some other universities, and entrepreneurship was probably just part of my career plan anyway. In terms of the timing, I felt this was the right moment, in the sense that the technologies have matured enough that commercialization makes sense now. For example, one story I have: I looked up the slide deck for my lectures at Stanford CS229 seven years ago, when I started to teach at Stanford. At that point we had a lecture with Chris Ré on applied machine learning: how do you apply machine learning in industry? And there were seven steps. The first step is you define your problem, the second step is you collect your data, then you choose the loss function, you train, you iterate, and so on and so forth. So it was pretty complicated at that point. Now, in the foundation model era, the only thing you have to do is that someone tunes the foundation model for you, and then you tune a prompt and you add retrieval-augmented generation on top of it, and that's pretty much it. Applying machine learning and AI in an industry environment is much, much easier than it was seven years ago. That's why I felt this was probably the right time to commercialize many of these technologies: they're more mature.

Yeah. This is actually a core premise even for the investing fund that I started, Conviction: somebody is doing the bulk of the work for you in a more general way, and so the application of AI in industry is just much, much cheaper, right? Because you only do the last few steps, or a different set of steps, but the last few steps in essence. So maybe you can talk about, given your wide range of research, the problem you focus on with Voyage that you saw with customers.

Yeah. So with Voyage, we are mostly building these two components, rerankers and embeddings, for improving the quality of retrieval, of the search system. The reason we focus on this is that we talked to so many customers and found that, right now, implementing RAG is not the hard part: you can just connect the components and have your RAG system ready very quickly.
But the bottleneck seems to be the quality of the response, and the quality of the response is heavily affected, almost bottlenecked, by the quality of the retrieval part. If the large language model sees very relevant documents, it can synthesize very good answers; even a Llama 70B can do that very well.

Can you give a general intuition for what a RAG system is and some of its applications?

Yeah, so a little bit of background. Retrieval-augmented generation: the idea is that there's a retrieval step and there's a generation step. The main point is that if you just use a large language model as a black box, as is, it wouldn't know anything about the proprietary information inside a company, and it doesn't know enough context about the use cases. The retrieval-augmented generation stack is about first retrieving some knowledge, for example from inside a company, and then giving that knowledge to the large language model so that it can generate or synthesize a good answer without hallucination. This has been found to be very useful for reducing the hallucination rate. So there are two steps: the first step is to retrieve some relevant information given the query, and then this relevant information is given to the large language model. The retrieval step is important because once the large language model sees the relevant information, the hallucination rate drops dramatically; it uses the relevant information as an anchor to refine its answers, in some sense. What we are doing is improving the quality of the retrieval, the relevance and accuracy of the retrieved documents and information. And the way this works, again, has two steps. First, you vectorize all of your documents, your whole knowledge base. You turn the documents into vectors, you turn the videos into vectors, you turn your code into vectors, everything into vectors. The vectors are the representations of each piece of knowledge or each document. Then you put these vectors into a vector database and you search for the relevant information using the vectors as indices.

Where are you seeing RAG applications today? What are customers building, or what are the most common systems?

Yeah, so we have a lot of users, and they're all over the place. We even have a customer that is a chemistry company, building a RAG system to understand their chemistry documents and product descriptions. It's almost everywhere: finance, legal, code retrieval, code generation, and so on. I think it can be applied to almost any case, even for individual users. You have a lot of personal information, and you want a RAG system on the phone so you can access your past information in a much easier way. For example, we've all seen that when you search your documents on your laptop, it's actually pretty hard; you have to use the exact file name. It would be much easier if that search were semantic.
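(Editor's aside: here is a minimal sketch of the two-step flow Tengyu describes, indexing documents as vectors offline and retrieving by similarity at query time. The `embed` function is a placeholder stub, the documents and helper names are made up for illustration, and a real system would call a hosted embedding model and a vector database rather than an in-memory array.)

```python
import numpy as np

# Placeholder for a real embedding-model call; this stub just returns a
# deterministic pseudo-random unit vector per text so the example is
# self-contained. Retrieval is only meaningful with a real model.
def embed(text: str, dim: int = 1024) -> np.ndarray:
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

# Step 1 (offline): vectorize the knowledge base and keep the vectors as the index.
documents = [
    "Q3 revenue grew 12% year over year, driven by the APAC region.",
    "The reset_password endpoint accepts a signed, time-limited token.",
    "Full-time employees accrue 1.5 vacation days per month.",
]
doc_vectors = np.stack([embed(d) for d in documents])   # shape: (n_docs, dim)

# Step 2 (online): embed the query, rank documents by cosine similarity,
# and hand the top-k documents to the LLM as grounding context.
def retrieve(query: str, k: int = 2) -> list[str]:
    q = embed(query)
    scores = doc_vectors @ q            # cosine similarity; vectors are unit-norm
    top = np.argsort(-scores)[:k]
    return [documents[i] for i in top]

context = retrieve("How do I reset my password?")
prompt = "Answer using only the context below.\n\n" + "\n".join(context)
# `prompt` would then be sent to the large language model for generation.
```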
Mhmm. RAG is a relatively new architecture. I think your average enterprise technology leader had not heard the term before the last year or so; it was popularized in research over the last few years. But there is already a debate, in terms of opinions from people at different large labs and in academia, about whether or not you need a RAG architecture to work on proprietary data. Just to describe some of the alternative views, I think there are two alternative points of view given. One is a sort of agent-chaining architecture, where you input your data and knowledge, your chemistry, code, law, finance, whatever documents, into a series of LLMs that just operate on it with instructions, for example to summarize or categorize it. Or you simply feed everything into LLMs with infinite context or actively managed context, versus explicitly vectorizing anything. So I would love to get your reaction to that as an alternative to RAG.

Actually, there was also a debate last year about RAG versus fine-tuning, and I think that debate is getting to a consensus now: RAG is much easier than fine-tuning, and fine-tuning in many cases doesn't work, because you need a lot of data to see results, and there are still hallucinations even after fine-tuning. Now, as you said, the debate becomes RAG versus agent chaining or long context. Maybe let's talk about long context first. I think there are probably two answers to this, from different angles, because long context right now is not practical yet. Right? So we have to anticipate what a long-context transformer can do and have the debate at a future time, in some sense. In the near term, a long-context transformer where you just put all the proprietary data, a billion tokens, into the context of the transformer will be very, very expensive. If you use the prices right now, it's going to be just impossible to do it; it's probably something like five to ten orders of magnitude of difference, depending on how many documents you have in the context. Of course, you can bring the cost down. For example, one approach is to cache the activations of all the internal computations over the documents you put in the context. That brings the cost down by a lot, but if you do the calculation, theoretically it's still much more expensive than RAG, because you have to save all of these activations, or intermediate computations, for the entire billion-token context, most likely in GPU memory, or maybe in CPU memory. You may argue that over time everything becomes cheaper and cheaper, but RAG will become cheaper as well, because many of the technologies under RAG are neural-network based, and GPUs will become cheaper and the neural networks will become smaller. So my prediction is that RAG will be much cheaper than long context going forward. Another way to think about this, just from first principles: my analogy for long context is that the context is the short-term memory, in some sense, and RAG is more like the long-term memory.

Mhmm.
So the question is, for example, when you answer any question, why do you have to go through the entire library every time, putting the entire library into your short-term memory to answer a single question? It sounds like the right approach should be that for every single question you retrieve a subset of the information and use that to answer the question. That seems to be the most efficient way to do it. There should be some kind of hierarchy in how we solve the problem so that we can get the best efficiency. Even in computer architecture, on the hardware side, you have different levels of caching: you have disk, you have CPU cache, and so forth. So in that sense, I feel the more hierarchical, two-level kind of system like RAG is more cost efficient.

Yeah, the analogies certainly make sense. I think there is another thread of discussion about what long-term memory for LLMs looks like, where it is something managed by the LLM itself, but I do not think that is a well-answered question, and RAG may just be a part of that answer.

So the embedding model and the reranker are, in some sense, the models that are managing the long-term memory. Of course, there might be variants and other ways to manage the long-term memory, but I think it will be somewhat similar. The technology always evolves gradually, right, so maybe two years later, Voyage or other companies will have a new version of long-term memory which is based on embedding models but extends them in some way. That's entirely possible.

Yeah. I do think it's useful to contextualize, for people who are not working with data sources for LLMs at scale every day, what the token limitations are. We've gone from a few thousand tokens to something like Gemini 1.5 Pro, with a context window of a million tokens. If you put that in word count, that's maybe five books, or 25,000 to 30,000 lines of code, and obviously a limited amount of video and audio. And so the ability to make reasoning decisions on more than that amount of data is obviously going to be needed. And then the questions, to me, are really: does efficiency matter, both from a cost perspective and a speed, latency perspective? How much can you push the context window? And does hallucination management matter? So I think there are lots of arguments for RAG being very persistent here.

Yeah, exactly. And just to add a little bit to that: a million tokens is five books, right? But many companies have 100 million tokens. That's a 100x difference, and 100x in cost is a big difference. That could be $100K versus $10 million. $10 million is unacceptable, but $100K sounds okay. I think that's probably what's going to happen, because at least for many companies right now, if they have 100 million tokens, I don't think they can use long-context transformers at all; it's way too expensive.
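(Editor's aside: a back-of-envelope sketch of the gap being described. The price per token and the retrieved-chunk budget below are assumptions made up for illustration, not quotes from any provider, and the sketch ignores caching, output tokens, and the cost of the embedding and search steps themselves.)

```python
# Back-of-envelope: per-query input cost of stuffing a whole corpus into the
# context window versus retrieving a few chunks with RAG.
PRICE_PER_MILLION_INPUT_TOKENS = 5.00   # assumed dollars per 1M prompt tokens

corpus_tokens    = 100_000_000          # "many companies have 100 million tokens"
retrieved_tokens = 4_000                # RAG: a handful of retrieved chunks per query

long_context_cost = corpus_tokens    / 1_000_000 * PRICE_PER_MILLION_INPUT_TOKENS
rag_cost          = retrieved_tokens / 1_000_000 * PRICE_PER_MILLION_INPUT_TOKENS

print(f"long context: ${long_context_cost:,.2f} per query")    # $500.00
print(f"RAG:          ${rag_cost:.4f} per query")               # $0.0200
print(f"ratio:        {long_context_cost / rag_cost:,.0f}x")    # 25,000x
```

The exact prices will keep changing, but the ratio, driven by how many tokens each query has to touch, is the shape of the comparison being made.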
Right. And the simplest example for me is a system that looks at the entire code base, or some representation of the entire code base, versus the portion of it that could fit into context today.

Yep.

What about the other piece, the idea of agent chaining and using LLMs to manage the data in that form?

So agent chaining is a growing area, and many people are doing research on it. I think it's less well defined, in some sense. At the first level, I would say it's kind of orthogonal to embedding models and rerankers, to some degree, because even when you have agent chaining, you still probably use embedding models as part of the chain. You probably do iterative retrieval as part of the chain, and of course you use large language models as part of the chain as well. In some sense it's an orthogonal direction. I would probably rephrase agent chaining as more like an iterative, multi-step, retrieval-augmented, large-language-model-augmented system. Some part of the retrieval is probably done by a large language model, some part of the system is done by a smaller language model, and some part is done by an embedding model, and so on. So in that sense, I feel it's somewhat orthogonal.

Yeah. And I feel like some of the motivation for agent chaining to begin with is the same efficiency motivation as RAG.

Yep, exactly. Right. But if you use a very, very large language model to manage the knowledge system, you again lose the efficiency. So it has to be a somewhat smaller model managing the knowledge, and at that point an embedding model might be the right thing to use in that agent-chaining framework. Maybe another angle to look at this is whether we should do iterative retrieval versus retrieving just once. I think iterative retrieval is definitely useful, especially because there is still a lot of headroom in embedding models' performance; that's why sometimes you have to retrieve multiple times, because the models are not clever enough. However, in the long run, my suspicion is that iterative retrieval will remain useful but become a bit less necessary as the embedding models get more and more clever. Once the embedding models are more clever, maybe one or two rounds will be enough.

Mhmm. If we go ahead and assume that RAG is at least a dominant architecture for enterprise use cases, where you care about proprietary data that is large, with reliability, how do you go about improving a RAG system? You can improve the LLM itself, but what are the other components you are working on, or what are the challenges from the user's, the builder's, perspective in improving retrieval quality?

Yeah, so there are a few ways. One way is to improve the prompting of the large language models. For example, you could tell the large language model to abstain if there's no relevant information in the retrieved documents. But because the large language models are so good these days, I don't think you need a lot of prompting anymore; they respond to those instructions so well.
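(Editor's aside: a small illustration of the kind of instruction being described, telling the generator to abstain when the retrieved context does not contain the answer. The wording and the function name are made up for illustration, not a recommended or tested prompt.)

```python
# Illustrative only: construct a prompt that asks the LLM to answer strictly
# from the retrieved chunks and to abstain when they are not relevant.
def build_rag_prompt(question: str, retrieved_chunks: list[str]) -> str:
    context = "\n\n".join(retrieved_chunks)
    return (
        "Answer the question using ONLY the context below.\n"
        "If the context does not contain the answer, reply exactly:\n"
        "\"I don't have enough information to answer that.\"\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )
```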
And then the next thing is to improve the retrieval part, which is the bottleneck in my opinion, because most of our users have found that if they improve the retrieval quality, that directly improves the response quality. For improving the retrieval part, there are two ways. One is to improve the embedding model. The other is to improve the things on top of it: how you chunk the data, whether you do iterative retrieval, whether you put some of the metadata into the data, and so on. So basically, there are two ways of improving: one is to improve the neural networks, either the embedding models or the rerankers, and the other is to improve the ways you use the neural networks with software engineering: better chunking, iterations, or other heuristics and tricks on top. What we specialize in is improving the neural networks, because that requires a lot of heavy lifting. It's a very data-driven approach, where we train our neural networks on trillions of tokens, at least, and we fine-tune them for special use cases. This is something a company should do, rather than every end user optimizing it themselves. And my long-term vision is that some of the software engineering layers on top of the networks will be less and less needed as the networks become more and more clever. For example, right now we already see that chunking has become less necessary because context windows have become longer and long-context embedding models, or relatively long-context ones, where long context here means something like 10K, maybe 16K tokens, enough to put a 50-page PDF into it, have become much better. So there's less need to chunk documents into pieces of, say, 512 tokens. And I think this will happen along other dimensions as well. Maybe in the future you won't have to turn your images into text descriptions and then give them to a text embedding model, which is what people do right now: everything is turned into text and then a text embedding model is used. When the embedding models are more clever and multimodal, you won't have to do that anymore.

Mhmm. Can you talk a little bit about the intuition for how fine-tuning or domain-specific embeddings improve performance?

Fine-tuning and domain-specific embedding models are what we are very good at at Voyage. Just for some context: what we do is start with a general-purpose base embedding model, which we also trained from scratch. From there, we first fine-tune, or continue pre-training, whatever you call it, on some domain-specific data. For example, we fine-tuned on two trillion tokens of code snippets, and that's how we get the code embedding model. And we did the fine-tuning on a trillion legal tokens, and that's how we got the legal embedding model. These domain-specific embedding models don't use any proprietary data, so everyone can use them, but they really excel in one particular domain, and the performance in other domains doesn't change much. The reason we do this is that the number of parameters in an embedding model is limited, because you only have a latency budget, something like maybe one second, sometimes 200 milliseconds.
Some people even want 50 milliseconds. And then it's basically impossible to use more than 10 billion parameters for an embedding model. Since we have limited parameters, customization is very important, because customization means you spend the limited number of parameters on the right tasks, the right domain, so that you excel in that domain. There's no way you can use these 10 billion parameters to excel at everything; that's why you have to specialize in one domain. And we have seen 5 to 20% improvements from this domain-specific fine-tuning, depending on the particular domain. For code, we have seen 15 to 20% improvement, partly because we have a lot of data there, and the headroom is also bigger, because code retrieval requires a lot of deep understanding of the algorithmic part of the code. For the legal domain, the baseline is a little better, so the headroom is slightly smaller; that's why we see 5 to 15% improvement depending on the dataset. For some of the very complex legal datasets, we have seen bigger improvements.

Just to make sure our listeners can picture exactly where the latency cost is coming from here: in a search system, your data has been vectorized by an embeddings model, but then every query also needs to be translated into an embedding and compared to the embeddings of your knowledge in order to feed the LLM for the generation you want. Right? So there's inference-time latency here as well. I just think that's not obvious if somebody hasn't built a RAG system.

Yeah, exactly. So at inference time you have to first turn the query into a vector and then do the search against the database. And related to this, the dimension of the vectors you produce also affects the latency of the vector-based search. If the dimension of the embedding is only 100, it's going to be much, much faster than when the dimension is a thousand. And actually, this is something we are very good at as well: we produce embeddings with 3x or 4x smaller dimension than some of the competitors.

Yep. That makes sense. Intuitively, you are creating embedding models that use a limited number of parameters and dimensions, given the latency budget that any application has, to create the best possible representation of proprietary data or domain-specific data.

Yeah, exactly. And going back to domain specificity and fine-tuning: the second level of customization is that we can customize to a particular company. So we fine-tune on the proprietary data of a particular company, and we see 10 to 20% improvement on top of the domain-specific fine-tuning as well. Of course, there's a total budget in terms of how much additive improvement you can get. If you start with 50% accuracy, you have 50% headroom; if you start with 90%, you only have 10% headroom. So the absolute improvement varies a little bit across domains.

Maybe just some advice for people who are building RAG systems: at what point do they begin to invest in some of these retrieval components?

Yeah, I think they can do it even from day one, as long as they have a prototype available. Right?
So basically, my default suggestion for our users is that when they have the RAG system, first of all, of course, you want to connect the components and at least see some responses. Then do some basic profiling in terms of latency and quality. You can check the retrieval quality, meaning how often you retrieve relevant documents; there are some standard ways to evaluate that. Then you also do an end-to-end evaluation of the responses, and you can see which part is the bottleneck. In many cases, people find that the retrieval quality is not good, so the final response is not good. And then you can swap some of the components. You can say, I'm going to try the Voyage embeddings, or try the Voyage rerankers, which we haven't discussed much, and you can try various different embeddings and possibly various different large language models as well.

Maybe just zooming out: you started by saying that in order to have the debate about RAG versus alternative architectures for working on proprietary data, you need to predict forward. Any predictions for how these systems change as LLMs improve dramatically, if we look at the next generations of OpenAI's GPT, Claude, the Mistral models, Llama, and so on?

Yeah, my prediction is that the system will become simpler and simpler. Maybe this is my biased view, or at least this is something we are working towards. The idea would be that it's a very, very simple system: you just have three components, a large language model, a vector database, and an embedding model, and maybe a fourth component, a reranker, which refines the retrieved results. You connect all of these, and the neural networks do everything else. You don't have to worry about chunking, multimodality, or changing the data format, because the neural networks can do most of that. Seven years ago, if you talked to any of the so-called language models of that time, you had to turn the data into a very, very clean format. Now you talk to GPT-4, you can have typos, you can have all kinds of weird formats, you can even dump JSON files into it. The same thing will happen for embedding models as well. So my vision is that in the future, AI will just be a very simple software engineering layer on top of a few very strong neural network components.

Mhmm. Yes. I think the bias toward "it is actually all going to be AI" versus complex, discretized software systems is, I believe, directionally right. Maybe zooming out to get a little bit of your perspective as a founder: what are one or two top learnings you have about starting the company as an academic, even despite your work with Google and other companies before?

Yeah, I think it's very, very different. Founding a company is very different from doing research at Big Tech, and actually it's a little bit closer to academia, because to run a university lab, I'm the CEO, CTO, CFO, and HR for the lab. You touch a little bit of everything, but at a slightly different scale. I think one of the biggest things I learned, actually from one of our angel investors, is that I should read some of the books.
Even though, for an experienced entrepreneur, many of those books are probably very basic, for me they were very, very useful, even the basic ones, including Elad's book, by the way. Although his book is a little bit advanced, in the sense that it's about how to scale from 10 people to 1,000 people, and I've only read a few chapters of it because we are about 10 people right now. And also talking to a lot of angel investors, talking to Sarah and my other lead investors. All of this helped me a lot in reducing the unforced errors in this process. To me, it's really about how to reduce the number of errors you make so that you can maximize efficiency; at least that's what has happened for me. And also how to correct mistakes as fast as possible. If you can correct mistakes one week after you make them, versus one month after you make them, that's a 4x efficiency improvement.

Mhmm. Very theoretically consistent with your vein of research. Last question: you have been personally productive, you run a productive research lab, and you've started a company. What do you think the role of academia in AI is in this age of scaling? Because most of your former students essentially all work at OpenAI or Anthropic, with a few professors and Citadel folks in the mix, plus the ones working with you.

Yes. Academia: this is a little bit of a controversial topic, and different people have different views. My view is that academia probably should work on different questions from what industry is good at. If we only work on how to scale up the systems, then obviously the incentives are not right; we don't have enough capital. Even at OpenAI, I guess Sam Altman argues that you need a lot of capital to do this. At the very beginning, the point was that it cannot be a nonprofit, because if it's a nonprofit, you don't have enough capital and you cannot scale up enough. I kind of agree with that, and that's why in academia it's very hard to scale up and have enough resources to do the large-scale research. However, in academia there are many, many other things we can do on a smaller scale, and we probably should focus more on long-term innovations. What I tell my students in the lab is that we should think about what the breakthrough in three to five years will be, as opposed to how to help OpenAI improve their large language models for GPT-5. That's why we work on optimizers: Adam is a ten-year-old optimizer, and we said, okay, that sounds like a long-term project; maybe in five years we can improve optimization efficiency by 5 to 10x. That's going to be a game changer for the whole landscape. If we improve the efficiency by 10x, that's like $100 million versus $10 million for training GPT-5, and I think that would change the landscape a lot in the industry. So efficiency is one of the things I spend a lot of time on. The other is reasoning tasks. The reason I identified that as one of my lab's directions is that it's challenging and it requires a lot of very innovative research.
It's very unclear whether scaling laws are really enough to get you to prove the Riemann hypothesis or any of the math conjectures. You also need superhuman performance, in some sense. If you train only on the Common Crawl data from the web, can you become a good mathematician? It's kind of hard to believe. So we need more innovation there. That's pretty much what we are doing at the university lab: we try to work on those three-to-five-year agendas, and on a smaller scale.

Mhmm. I think that's an inspiring note to end on, and a very open-minded one about what is still to be figured out. Thanks so much for doing this, Tengyu.

Thanks so much.

Find us on Twitter at @NoPriorsPod. Subscribe to our YouTube channel if you want to see our faces. Follow the show on Apple Podcasts, Spotify, or wherever you listen; that way you get a new episode every week. And sign up for emails or find transcripts for every episode at no-priors.com.