How AI Happens

GRID.ai's Lead AI Educator Sebastian Raschka

Episode Summary

Joining us today on How AI Happens is Sebastian Raschka, Lead AI educator at GRID.ai and Assistant Professor of Statistics at the University of Wisconsin-Madison. Sebastian fills us in on the coursework he’s creating in his role at GRID.ai, and we find out what can be attributed to the crossover of machine learning in academia and the private sector. We speculate on the pros and cons of the commodification of deep learning models and which machine learning framework is better: PyTorch or TensorFlow.

Episode Notes

Tweetables:

“In academia, the focus is more on understanding how deep learning works… On the other hand, in the industry, there are [many] use cases of machine learning.” — @rasbt [0:10:10]

“Often it is hard to formulate answers as a human to complex questions.” — @rasbt [0:12:53]

“In my experience, deep learning can be very powerful but you need a lot of data to make it work well.” — @rasbt [0:14:06]

“In [Machine Learning with PyTorch and Scikit-Learn], I tried to provide a resource that is a hybrid between more theoretical books and more applied books.” — @rasbt [0:23:21]

“Why I like PyTorch is that it gives me the readability [and] flexibility to customize things.” — @rasbt [0:25:55]

Links Mentioned in Today’s Episode:

Sebastian Raschka

Sebastian Raschka on Twitter

GRID.ai

Machine Learning with PyTorch and Scikit-Learn

 

Episode Transcription

“SR: So if you want to develop something new, you want to, let's say, reduce the friction. You want to make it as simple as possible. And I feel like for me, personally, PyTorch does that a little bit better.”

[00:00:14] RS: Welcome to How AI Happens. A podcast where experts explain their work at the cutting edge of artificial intelligence. You'll hear from AI researchers, data scientists, and machine learning engineers as they get technical about the most exciting developments in their field and the challenges they're facing along the way. I'm your host, Rob Stevenson. And we're about to learn how AI happens. 

[INTERVIEW] 

[00:00:41] RS: Joining me today on How AI Happens is the Lead AI Educator over at grid.ai, as well as an Assistant Professor of Statistics over at the University of Wisconsin-Madison, Sebastian Raschka. Sebastian, welcome to the podcast. How are you today?

[00:00:53] SR: Hi, Rob. Thanks for the kind invitation to be on your podcast. I'm super excited. And I think, yeah, I'm always excited to talk about machine learning, AI, deep learning. And, yeah, great to be here. And I think we will have a fun time here.

[00:01:06] RS: Yeah, really pleased you're here as well. Loads to go into with you, because you just published a book back in January. It's quite technical. I'm excited to get into that with you. But also, you have this new-ish role here at grid.ai. Would you mind sharing a little bit about that? Maybe we can start with kind of your background in the space and how you wound up in your current role?

[00:01:24] SR: Yeah. So that is an interesting, maybe big question here to start off with. So yeah, I recently joined grid.ai as the Lead AI Educator. But let's say starting with maybe what I've been up to before. So I did my PhD at Michigan State University in computational biology, where I focused on, yes, solving problems related to small molecule discovery, drug discovery, and so forth. And a lot of that was related to pattern recognition and also involving machine learning, and so forth. 

And in 2018, when I graduated, I joined the University of Wisconsin Madison as an Assistant Professor in the statistics department. And yeah, so my focus areas have been on AI research related to, let's say, machine learning and deep learning. So AI is always, let's say, the umbrella term, but more specifically, machine learning and deep learning. 

And next to research, I am still very passionate about teaching. So I created two new classes in the department focused on machine learning and deep learning. And it has been a great time at the university. I really like my work. But I also noticed, over the years, a few, let's say, limitations. For example, the areas of AI research that I'm interested in are moving very fast, technology-wise.

And, yeah, I noticed that in an academic setting, if you are a strong theory person, let's say, if theory is your thing, you're maybe more at home in academia. But for someone like me who likes technology, I also wanted to explore, let's say, the industry perspective, to stay current and up to date. And at the same time, there's also, let's say, the teaching aspect. I really like teaching. But I also noticed the limitation that you can only have so many students in a class. And we always have this huge wait list, because people are very interested in machine learning and deep learning, and we can only take so many people each year. So in that sense, I wanted to see what we could do there and maybe explore new roles.

And at this point in time, there was this great opportunity at grid.ai where I can basically do both. I can, let's say, explore the industry route, but then also, at the same time, develop educational content without, let's say, restrictions. Right now, I'm working on an online course where, let's say, I don't have the limitation of having people in the room. I mean, having people in the room can be an advantage. But also, I really want to, let's say, experiment with new takes on education, which is why I'm currently super excited to be at grid.ai, where I have, let's say, some resources to experiment with new forms of delivery. So, yeah. That is, in a nutshell, my journey from, let's say, computational biology, to teaching and doing research in machine learning in the statistics department, and now my new role at grid.ai, where I focus mainly on creating educational content.

[00:04:06] RS: Yeah. So I was going to ask what exactly that means, to be the Lead AI Educator. So are you putting together e-courses? Are you just trying to figure out how you can reach the most people? And I guess, as a follow-up, what are the courses that you're designing?

[00:04:18] SR: So right now, I mean, I just joined a few months ago, three months ago, but it will mainly be focused on, let's say, making an introductory course first, because you have to start with a basic, let's say, foundation before you dive into more, let's say, advanced concepts. 

So right now, in a nutshell, I'm making an online course on deep learning, an introduction to deep learning, but with an interesting take on, let's say, how the animations and everything will be delivered. And I'm also trying to focus on exercises, because I think that's one important part: you want to learn the concepts, so you have to have an explanation to start with, but then you also want to practice what you've learned, because, really, that is how you solidify your knowledge, how you, let's say, test whether you understand things, and how you would then take examples into practice.

Personally, for me, when I work on a new project, I always use something I worked on previously as a template, because always writing things or developing things from scratch is very inefficient and also error-prone. So in that case, I hope that designing good exercises will also help people create their own templates that they can then, let's say, use in real-world applications. So, yeah, the focus is really an introduction to deep learning with a nice, let's say, video format, but then also nice exercises to practice your understanding.

[00:05:34] RS: Got it. There seems to be a difference between using AI and machine learning in an academic sense, versus a corporate business sense. For you, when you approach education in this way, are you enabling people to conduct research? Or are you enabling them to build products more in the private sector? What is the ideal outcome, I suppose, of your coursework?

[00:05:55] SR: Yeah. So I think it could be both. I wouldn't say it restricts people to only doing research or only developing applications in the private sector. It could be the union of both, or the intersection, depending on what you want to go for. But my focus will really be on, let's say, the latest technologies. Because in an academic setting, and that's at least what was true for me for a long time, you focus more on, let's say, tweaking models to get better performance, but then also applying models, let's say, in different contexts, when you have collaborations on different datasets, where people want to use machine learning to, let's say, get better performance than with the methods they used before, the traditional methods in that field.

But usually how it looks is: you use a model, you train it on one computer, or one GPU, and see what performance you get. You tweak it a little bit and so forth. Nowadays, let's say, the more modern take on that is that you don't really need to do the nitty-gritty coding by hand anymore, like from scratch. There are many tools that allow you to be more efficient. For example, when you are tracking your experiments, traditionally you would put numbers into an Excel spreadsheet. But nowadays, there are many great tools available for that. I don't want to, let's say, single out a particular one right now in this podcast, because there are many great tools out there. But there are really tools that take this work off your hands and allow you to go one step further, so you don't have to worry about these nitty-gritty details, reinventing things from scratch. So really, how to leverage, let's say, tools developed by professionals for professionals. And then also really moving away from this paradigm where you have to run, let's say, the machine learning model on your laptop.

I mean, that works for certain datasets and for certain algorithms. But nowadays, with cloud resources and also compute clusters, there are more possibilities and opportunities to really leverage, let's say, more current technology. The bottleneck is sometimes for the libraries, the deep learning libraries, to support these. But this is also something I will teach in the course: how, with one or two changes, like lines of code, you get from training a model on one GPU to training a model on 10 GPUs, for example. And really helping people to scale models and use best practices, but also in a way that recognizes that if you train your model, it usually doesn't stop there, right? So you have your model. In a research context, well, that's maybe all you need. You put your accuracy into a table compared with other models. But often in practice, you want to, let's say, deploy a model. You want to put it on a device. You want to classify new customer data or something like that. And also, how do you go from the trained model to a model that is ready for production? Really, this whole pipeline from training your model efficiently to using your model in a real-world scenario.

And I think this kind of bridges academia and, let's say, the private sector, which you mentioned. It's really about how you use the latest technologies, let's say, going from how you trained machine learning models in 2015 compared to how you would do it nowadays in 2022, essentially.
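Sebastian doesn't name a specific library here, but as a rough illustration of the "one or two lines of code" idea he describes, here is a minimal sketch using PyTorch Lightning, one framework that works this way. The tiny model is a hypothetical placeholder, the data loader is omitted, it assumes a multi-GPU machine, and exact Trainer argument names vary between library versions, so treat it as a sketch rather than the course material itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import pytorch_lightning as pl

class LitClassifier(pl.LightningModule):
    """A tiny placeholder model; the point here is the Trainer, not the network."""

    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(784, 10)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return F.cross_entropy(self.layer(x), y)

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)

# Training on a single GPU:
# trainer = pl.Trainer(accelerator="gpu", devices=1)

# The "one or two lines of code" change: ask for 10 GPUs and a distributed strategy.
trainer = pl.Trainer(accelerator="gpu", devices=10, strategy="ddp")
# trainer.fit(LitClassifier(), train_dataloaders=...)  # data loader omitted in this sketch
```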

[00:09:03] RS: You know, Sebastian, I've spoken to a handful of folks now who sort of live in both worlds, in the private sector and in academia. They live on both sides of the fence. Their grass is always green, that sort of thing, which is unique to this industry as far as I've experienced. And I wonder why that is? Do you get different kicks out of academia versus the private sector? Or is it that the expertise is so rare that people at the top of this field can't help but also teach? Why do you think this phenomenon exists?

[00:09:34] SR: Yeah, it's a very good question. I think it also has something to do with the nature of what you use machine learning for. In academia, I think the focus is mostly on understanding how deep learning works. It's still an unsolved problem. There's a lot of research going on theory-wise, understanding neural networks, why they work, in a certain, let's say, lower-level sense: optimization algorithms and so forth.

So in that case, it's very attractive for academia, because that's usually what people there are really strong at, the theoretical analysis. At the same time, there are also many, let's say, departments that are not focused on machine learning research but can benefit from it. So you can help a lot of people in academia by helping them, let's say, apply machine learning to their problem sets.

On the other hand, in industry, there are also, of course, a lot of use cases for machine learning. And industry usually has more resources available than you might have if you are working in a lab. Let's say, using a compute cluster at the university is really not simple. So sometimes you lack the resources, or there simply aren't enough of them. That makes the industry sector very attractive for a machine learning person if you want to take your work, really scale it up, and take it to the next level.

And also, nowadays, especially with these large language models, it's really not feasible anymore to work on these problems as a single person. You really need a team. And in academia, yeah, you have teams, but they are less, let's say, focused on the technology. Rather, the teams are usually formed around collaborations, I think, where the goal is really a particular data analysis.

So, long story short, you notice maybe I don't have a concrete answer here. But I think it's really the application of machine learning versus the theory. Machine learning is an applied field that can be studied theoretically, but it's very fun, I would say, to solve problems with it, which is probably why people jump back and forth: trying to understand a problem, but then also, let's say, trying the thing out in practice. Off the top of my head, maybe if you work on a race car or something like that, it's fun to, let's say, understand your race car, to tune it up and make it better. But then it's also fun to take your race car onto the racetrack and race around and have fun with it, right? So it's probably the nature of machine learning that makes this jump between academia and industry attractive to people.

[00:11:52] RS: Yeah, yeah. The race car metaphor is apt. I do appreciate that. And you mentioned resources are an aspect as well. And that makes all the sense in the world to me. And it's related, I think, a little bit to a question I had for you about deep learning, which is that there's this common cliché, a maxim maybe you'd call it, that everything is a deep learning problem. Do you agree or disagree with that?

[00:12:15] SR: It depends on the context a little bit. So first of all, we have certain types of problems where we want to do predictions. And often, it is hard to formulate, let's say, answers as a human to complex questions. So if I want to classify images, to me, if I see an image of a dog and an image of a cat, it's very clear to me what is a dog and what is a cat. But if you asked me to write this up in code, like, how would I even go about this? I could say, “If this pixel area has a certain shape or color, then it's maybe this. And otherwise, it's that.” But it is a very complex task. 

And in this case, I think using something like machine learning and deep learning is probably the way to go. But yeah, there are also problems where you maybe don't need machine learning and deep learning, if the problem is simpler. Off the top of my head, let's say you want to classify or search for something, like whether an email contains a certain word or not. For that, you can just check programmatically, "Does this email contain this word or not?" You can maybe check different versions, like uppercase, lowercase, with spelling errors. But you maybe don't need machine learning for that.
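To make the "you don't need machine learning for that" case concrete, a rule-based check like the one described above can be a few lines of plain Python. The keyword set is a hypothetical placeholder, and lowercasing only covers the uppercase/lowercase variants; handling spelling errors would take more work.

```python
KEYWORDS = {"invoice", "receipt"}  # hypothetical words to look for

def contains_keyword(email_text: str) -> bool:
    # Lowercase and strip punctuation so "INVOICE," still matches "invoice".
    words = (word.strip(".,;:!?") for word in email_text.lower().split())
    return any(word in KEYWORDS for word in words)

print(contains_keyword("Please find the INVOICE attached."))  # True
print(contains_keyword("See you at lunch tomorrow."))         # False
```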

But then the question is also deep learning versus classic machine learning, right? It also depends on how much data you have and what your data looks like. In my experience, deep learning can be very powerful, but you need a lot of data to make it work well. In collaborations, or just in my book, there was an example where we worked with text data, and we used the same text dataset three times in the book.

For context, this was the IMDb movie dataset, a movie review dataset where people wrote movie reviews and gave them a rating. The task is to classify whether someone liked the movie or not. The dataset has, off the top of my head, I think 50,000 examples, which sounds like a lot. But in the deep learning world, "a lot" means different things.

So in the first sentiment analysis chapter, we start with just a simple, classic machine learning algorithm to classify whether a person liked the movie or not based on the reviews. And we got like 88% accuracy, which sounds pretty good. But then, let's say, you compare it to a recurrent neural network, because you think, "Okay, a recurrent neural network is deep learning. Maybe this will perform better." And I tried a lot of configurations, but I couldn't get it to classify better than this logistic regression classifier. I think the best performance was around 86% accuracy.

And this was a lot of work tuning it, and then you realize, "Okay, maybe 50,000 is like the threshold where deep learning and classic machine learning are kind of equal." So it doesn't make a huge difference in that case. So on the question of whether everything should be a deep learning problem, it really depends. How I would go about this is really starting with a simple baseline, something that is maybe not even machine learning. There are dictionary-based classifiers, where the classification is based on a dictionary lookup of which words are, let's say, contained in a text. And if you have a certain number of those words, then you classify it as one category. Otherwise, it's another category.

Then, after this baseline, I would use the classic machine learning classifier, get some performance, and then optionally use deep learning and see whether it's really worthwhile, because deep learning requires more data, but also more resources, resources in terms of compute power, which kind of also relates to how much, let's say, money you have to rent GPUs or even buy your own GPUs. 
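A minimal sketch of the classic machine learning baseline described above, using scikit-learn: a TF-IDF bag-of-words representation fed into a logistic regression classifier. The handful of reviews here are placeholders standing in for the 50,000 IMDb examples, and the exact preprocessing in the book differs.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Placeholder data standing in for the IMDb reviews discussed above.
reviews = [
    "a wonderful, moving film with great acting",
    "utterly boring, I walked out halfway through",
    "one of the best movies I have seen this year",
    "a complete waste of time and money",
]
labels = [1, 0, 1, 0]  # 1 = liked the movie, 0 = did not

# TF-IDF features + logistic regression: the classic ML baseline.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(reviews, labels)

print(model.predict(["a wonderful film, great acting"]))  # predicted sentiment for a new review
```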

And then, of course, there's also the question of how much time do you want to spend on tuning it? So sometimes tuning deep learning algorithms can take weeks. And it is also – I mean, time is money, in a sense, right? So you could be working on something else, rather than tuning your machine learning or deep learning model. 

Just to finish up this short, let's say, section on this dataset, I used it a third time in the book. And there I used a large language model, the BERT model, which is one of the popular large language models. Those are even more expensive than, let's say, recurrent neural networks. And I didn't even attempt to train it from scratch, because that would be super expensive and cost a lot of money, multiple thousands of dollars. But what you can do, what's nice about these large language models, is that you can fine-tune them.

So someone pre-trained them for you. You download the pre-trained model, and then you fine-tune it on your target dataset. And when I did that, I actually got better performance than with the logistic regression classifier and the recurrent neural network. I think I got like 92% accuracy, which is, yeah, quite a lot better than the other two models.

But then I should say the caveat here was, while it was easy, it took like four or five hours on a GPU, which, again, is more work and more effort. And of course, this is also only 50,000 reviews; a real-world scenario may have even more data, and then it becomes even more expensive to fine-tune these models. So there's always a tradeoff.
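The book walks through this in detail; purely to illustrate the fine-tuning workflow described above (download a model someone else pre-trained, then continue training it briefly on your own labeled data), here is a compressed sketch using the Hugging Face transformers library. The checkpoint is a generic BERT-style model, the two-example batch is a placeholder, and real fine-tuning would loop over a DataLoader for a few epochs and evaluate on held-out data.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Download a pre-trained model and tokenizer (a smaller BERT-style checkpoint).
checkpoint = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# One illustrative fine-tuning step on a placeholder batch of reviews.
texts = ["a wonderful, moving film", "utterly boring and far too long"]
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

outputs = model(**batch, labels=labels)  # returns loss and logits
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()
```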

And I think, coming back to the original question of whether everything should be a deep learning problem, it really depends on your objective. Sometimes accuracy is your objective. In that case, I would go with the large language model. Sometimes developer time is also an objective. You maybe want to solve the problem tonight rather than next week. In that case, maybe the simple baseline is the solution, if its accuracy is appropriate.

Sometimes you don't have access to multiple GPUs, so maybe you can't use deep learning. Or maybe you want to use a very simple classifier so that you can deploy it on a mobile phone and use the application in an offline mode, where you don't need to, let's say, send several requests. In that case, you need a small model, and maybe the logistic regression classifier is more attractive than the recurrent neural network or the large language model. So it really depends, long story short, on the objective of what you want to solve and what your limitations or, let's say, conditions are.

[00:18:11] RS: Sure. There is a trend for compute power to become more affordable, smaller, more accessible, and I think we can expect to see something similar with models, right? In the same way a software engineer might go to GitHub or Stack Overflow and say, "Oh, someone else has already written this. I can just put it into my codebase." It's probably not there yet, but it points toward the commoditization of data, of processing power, and of models. So if that continues, will deep learning become more widespread? Will it be used just because the barrier to entry is a lot lower? Even though it depends, as you say, you might as well use it if it's that accessible.

[00:18:53] SR: Yeah, that's a good question. I think if the models are more accessible, it will be more attractive to use them. However, the models still remain very large. Even though, let's say, they are pre-trained and you can download them, they remain very large. So suppose you have an edge device where you want to run the model, let's say in a smart home context, let's say on your coffee machine or something. And the coffee machine, let's say, doesn't have a reliable connection to the Internet, or you don't even want to have it connected to the Internet. And it should, let's say, predict the water temperature based on the type of beans, based on some sensors it has. In this case, I think these large models wouldn't really work, because maybe the chip on the coffee machine is not large enough to store this large model. So I think it can be a solution for many problems, but I don't think it's a one-size-fits-all solution.

And then the question is also that many people have different datasets. You can develop these pre-trained models for certain tasks, let's say language classification for certain categories. But there are lots of business problems, or general problems, that require really specialized datasets, and you would still have to fine-tune these models. And some people also work with proprietary data. So you maybe don't want this model communicating with the Internet, or you don't want to, let's say, upload your dataset to someone else's platform. You want to have everything local, or develop things locally. And then fine-tuning these models will still require a lot of resources.

So I think there are two types of problems. One is really the resource access, making resources more accessible. And let's say, coming back to grid.ai, we work on this partly. But then also, you want to make these models more accessible and easier to use, which we also work on. But it is still like an unsolved problem. We are working towards it. But I don't think, let's say, in the next couple of years, this will be a solved problem. 

I think it will become easier for people to tinker with models and adjust them to their use cases, but it will still take a few years, I think, to really make this super easy. And there's another thing: if you want to, let's say, be ahead of the competition, you maybe don't necessarily want to use the same model everyone else is using. So you also want to understand your model and make some changes and tweaks, to get ahead of the competition by giving your model a unique twist or advantage. So there are many reasons, I think, why developing more accessible models is good. But there are also a lot of reasons why we still want to, let's say, work on developing new models and being able to understand and tweak models.

[00:21:29] RS: Yeah. I tend to agree with you. I feel as though the commoditization of models would result in just a kind of a race to the middle, right? And if you wanted to truly innovate, then you can build a code base off of that. 

Can we talk about your book a little bit? 

[00:21:40] SR: Oh, yeah, sure. I would be very excited to talk about that one.

[00:21:44] RS: The book came out back in January. It's called Machine Learning with PyTorch and Scikit-Learn. And it strikes me that it's a little more advanced, a little more technical, than perhaps some of the intro courses you're putting together right now. So I'm curious, we can get into the thrust of what the book is about. But first, why did you set out to write it? And what were you hoping to accomplish?

[00:22:02] SR: Yeah, so that is another good, interesting, and big question. Why did I set out to write it? It comes back to my passion for writing. I started writing a lot of blog posts back when I was a PhD student, and I really liked this process of putting my thoughts down on paper, or, let's say, in a text editor. So in a sense, I was writing this book, first of all, because I like writing. But I was also trying to fill a niche in terms of the books out there.

So there are lots of textbooks out there, lots of books that are very good at teaching people the theory. There are also a lot of books that teach people, let's say, how to use a certain tool. My book reflects how I think and how I like to approach things: I like to understand a little bit of the theory, but I also like to do things.

So in this book, I try to provide a resource that is, let's say, almost a hybrid between more theoretical books and more applied books. The applied books, let's say, don't necessarily explain how the methods work; they just show you how to use them. And the theory books mostly explain the theory of the methods, but they don't show you how you can use them.

My book sits in the middle, where I have a little bit of both worlds, because I think this is a very powerful way of introducing people to a new concept. If you only cover the theory, and this is at least true for me, you will easily get bored. I always try to read more theory and math textbooks, but at some point I just drop them, because I somehow can't really motivate myself. I need to do something in between. Sometimes how I motivate myself is to code up these concepts. For me, at least, it's true that I need to do things in practice to keep myself motivated.

And at the same time, when I sometimes read documentation or, let's say, other applied books, I see how you can use the methods, but sometimes I want to know more. I want to know, "Okay, why does it work? How does it work exactly?" So yeah, my book reflects how I think and what I personally like. I like a little bit of both: explain to me a little bit about what this method is and how it works, but then also show me how I can use it.

[00:24:17] RS: Got it. And so you spend a good chunk of time in the book just sort of espousing the virtues of PyTorch, and in particular, how you prefer it over TensorFlow. Is that correct?

[00:24:28] SR: Yeah. So that's a big debate, TensorFlow versus PyTorch. Personally, I started with Theano; that was back in 2015. That was my first book. It had a very short section on deep learning with Theano, which was a deep learning framework back then. And then, 2015-ish, TensorFlow came out, and I immediately switched over to it because, yeah, I sometimes can't help it. I get excited about new technology.

So I learned TensorFlow and was using that for my research for about two years. And then PyTorch came out, I think around 2017-ish. I didn't start using it from day one, because I was pretty happy with TensorFlow. But then, let's say a few weeks or months later, I again couldn't help it, and I started using PyTorch. And I really fell in love with it, because I really liked the syntax; it looked more like what people call Pythonic to me. And also, it was the right level of abstraction. It was not too tedious to code networks, but at the same time, it was very flexible.

So for me, why I like PyTorch is that it gives me the readability, but then also the flexibility to customize things. For my research, sometimes I need to develop custom, let's say, neural network layers. With PyTorch, it's simpler to do that, because back then, with the dynamic graph, it was also easier to debug, and so forth.

Just recently, when I worked on a research project, I developed a custom layer in PyTorch, and then someone ported it over to TensorFlow and Keras. And just looking at the code, it was just so many more lines of code. I mean, it works. And for a user, it's still just importing a single layer or function. But as a researcher, I think it's just this extra hurdle. So if you want to develop something new, you want to, let's say, reduce the friction. You want to make it as simple as possible. And I feel like, for me personally, PyTorch does that a little bit better.
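Not the layer from Sebastian's research, but as a flavor of what "developing a custom layer" looks like in PyTorch: subclass nn.Module, register the parameters, and write the forward pass as ordinary Python. The specific computation here, a linear transform with a learnable elementwise gate, is an invented example.

```python
import torch
import torch.nn as nn

class GatedLinear(nn.Module):
    """Hypothetical custom layer: a linear transform scaled by a learnable gate."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.gate = nn.Parameter(torch.ones(out_features))  # one learnable gate per output unit

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Eager (dynamic-graph) execution: you can drop a print() or a
        # debugger breakpoint right here while developing the layer.
        return torch.sigmoid(self.gate) * self.linear(x)

layer = GatedLinear(16, 4)
out = layer(torch.randn(8, 16))  # shape: (8, 4)
```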

[00:26:12] RS: So do you think that there are still use cases for TensorFlow? Or will one of these win out, with the other proven obsolete?

[00:26:20] SR: Oh, yeah. So I would say TensorFlow has historically been better at, let's say, industry problems, where the focus is more on the deployment of a model. However, based on, let's say, the developments of the last one to three years, PyTorch has really caught up in terms of deployment, support for mobile devices, and stuff like that. So from what I hear from people, they are on par, or almost on par, when it comes to deployment now.

I must say, I don't know all the details of how people use TensorFlow in production right now. But with the moves that PyTorch has made towards, let's say, being more production friendly, how to say it, I think it has been becoming more attractive for industry use cases as well. And in research, yeah, if you look at recent papers, there's the website Papers with Code that compiles some statistics, and most people really use PyTorch. And I can really see PyTorch winning out, even if it's just because people do research in PyTorch, release the model code in PyTorch, and then people in industry just adopt this code without having to worry about translating it to TensorFlow.

So I can see PyTorch becoming more and more popular. But at the same time, I don't think TensorFlow will go away completely. There will always be people still using it. So in that case, I would almost say understanding a little of both of these tools is not a bad idea.

[00:27:42] RS: Got it. Well, Sebastian, we are creeping up on optimal podcast length here. But before I let you go, I kind of just want to ask you to reflect a little bit on what is most exciting to you in this space, whether it's explicitly related to your work or not? Is there some use case or a new technology that has you particularly excited that you think is going to be uniquely disruptive?

[00:27:59] SR: Yeah. So what I find really exciting right now has maybe less to do with the latest and greatest models people are developing, and more with the new opportunities in terms of using these models, in terms of technology that makes it really easier. Like I mentioned before, it was super hard back then to train models on multiple GPUs. This is now much easier. And this is also helping, let's say, people like me, or other people, to really increase the amount of things you can do, because you can now leverage more resources and things go faster.

And the other thing is also applying deep learning to new, let's say, problem areas. When I did my PhD back then, I don't think graph neural networks had been around. Or at least no one was using them. So there was always this whole dance around, "How do you get your data into the right shape for the model?" How do you get your data into a tabular format?

And in my PhD, a lot of my work centered around working with small molecule data, where you have molecules consisting of atoms connected by covalent bonds, which you can think of as graph data. And nowadays, with graph neural networks, there's a whole new opportunity to use deep learning for these types of problems. That's also something I'm excited about.
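As a sketch of the basic idea behind graph neural networks for molecule-like data, where atoms are nodes with feature vectors and bonds are edges, here is a single graph-convolution layer written in plain PyTorch. Real work would typically use a dedicated graph learning library and richer chemistry features; this is only an illustration of how information flows along the bonds.

```python
import torch
import torch.nn as nn

class SimpleGraphConv(nn.Module):
    """One layer of H' = ReLU(A_hat @ H @ W), with A_hat a degree-normalized adjacency."""

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, node_feats: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # Add self-loops and normalize by degree, so each atom averages
        # over itself and its bonded neighbors before the linear transform.
        adj_hat = adj + torch.eye(adj.size(0))
        deg = adj_hat.sum(dim=1, keepdim=True)
        return torch.relu(self.linear((adj_hat / deg) @ node_feats))

# Toy "molecule": 4 atoms, each with an 8-dimensional feature vector,
# and an adjacency matrix marking which atoms are bonded.
node_feats = torch.randn(4, 8)
adj = torch.tensor([[0., 1., 0., 0.],
                    [1., 0., 1., 1.],
                    [0., 1., 0., 0.],
                    [0., 1., 0., 0.]])

layer = SimpleGraphConv(8, 16)
print(layer(node_feats, adj).shape)  # torch.Size([4, 16])
```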

I mean, it's not just graph neural networks. In general, people are also focusing more on, let's say, working on the data: first of all, getting the data into the right form so that you can use deep learning, but then also, let's say, getting more bang for the buck from your data. Instead of just feeding data into the model, there's also this movement around data-centric AI, where people are now trying to develop methods more focused on improving the data. And I think this is also very exciting, because it will help us solve problems where, back then, we didn't have enough data or the data was too noisy.

So really, let's say, keeping the model fixed and seeing how we can improve the data to get more performance. I think this will have a big impact on many people and many problem areas. Because, yeah, based on my collaborations, getting labeled data is usually very expensive if you don't, let's say, use standard data. If you look at machine learning papers, they mostly focus on benchmark data, like ImageNet and big text datasets, which are already labeled. But in real-world problems, you usually have very small, unlabeled datasets. And this move towards improving the data to get more performance from your small dataset will, I think, also have huge impacts. I'm kind of excited about that.

[00:30:24] RS: Fantastic. Sebastian, this has been a great conversation. Thank you so much for being with me here today. 

[OUTRO]

[00:30:33] RS: How AI Happens is brought to you by Sama. Sama provides accurate data for ambitious AI, specializing in image, video, and sensor data annotation and validation for machine learning algorithms in industries such as transportation, retail, e-commerce, media, medtech, robotics, and agriculture. For more information, head to sama.com.

[END]