Scaling Data Operations With Platform Engineering – YouTube Dictation Transcript & Vocabulary

Welcome to FluentDictation, the best YouTube dictation site. Master this C1-level video with our interactive transcript and shadowing tools. "Scaling Data Operations With Platform Engineering" has been split into segments, which makes it ideal for dictation practice and pronunciation work. Read the annotated transcript, learn the key vocabulary, and improve your listening skills.

Interactive Transcript & Highlights

1. [Music] Hello and welcome to the Data Engineering Podcast, the show about modern data management.

Data migrations are brutal. They drag on for months, sometimes years, burning through resources and crushing team morale. Datafold's AI-powered migration agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.

Your host is Tobias Macey, and today I'm interviewing Chakravarthy Kotaru about scaling successful data operations through standardized platform offerings and being able to provide databases as a service at scale. So Chakravarthy, can you start by introducing yourself?

Sure, thanks for having me. I'm Chakravarthy Kotaru. I work as a director of data platform for a leading online travel company, and I have nearly two decades of experience with scalable systems, especially in data stores, data governance, and security. I have spent a big part of my career focusing on building data platforms in public and private clouds.

And do you remember how you first got started working in data?

It was right out of the recession. I applied for like 1,200 internships. I got two. One was from Walt Disney World, Florida. That was a really great experience. So I started as an Oracle developer, and after that I joined a major insurance company as a database developer. But then the project got scrapped, and my manager at that time gave me an option: you learn operations, database administration, or go find another job. So I was like, okay, I will learn whatever it takes. So I started picking up different NoSQL databases. At that time, I specifically remember I started on Riak, which is bankrupt now, but it's similar to Cassandra. But that's what gave me the opportunity to explore different NoSQL databases in addition to relational
databases, and that's where we started building the data platform in the private cloud.

In terms of the idea of databases as a service and data platforms, obviously a lot of the cloud providers have that as one of their offerings, with generally the focus being: I want a database to be able to use for whatever application I'm building. I'm wondering if you can just start by giving some overview about some of the ways that those database-as-a-service offerings that are part of that core offering from the cloud providers are maybe insufficient, and some of the challenges that you ran into as the person responsible for that data layer, and how that led to different failure modes for the teams that you were supporting.

Yes. So our philosophy is mostly that we don't want to select a few databases and have use cases designed around them. We want to specifically have a use case which solves a problem and figure out what is the right database for it. So we don't want to put everything in relational, or put everything in Dynamo, or put everything in a select set of databases that you are comfortable with. But if you are doing a search, if you are doing full-text search, we want to go and get the best search engine for that, which is Elasticsearch. So specifically targeted databases for specific applications, that will give you the best overall return on your investment and the best use case experience. So that's one reason why you use multiple databases. Another reason is obviously, you know, big companies, mergers and acquisitions: different companies use different things, and then when they come together, it's not straightforward to change everything overnight.

In terms of the different options for database types and database use cases, you mentioned things like Elasticsearch and DynamoDB. Obviously there are various relational databases that are on offer. And in terms of the particular data use cases that you were supporting, I'm wondering if you can give a bit of an overview and some
of the reasons why you think these various NoSQL offerings were the most relevant and effective options for the problems that you were looking to support.

Right. So the major ones that we support are relational. We have a lot of SQL Server that we are trying to migrate to a cloud-native database. But that is traditional; 10 years back, mostly everything was relational, so we have a lot of tech debt on it, mostly transactional databases for payments and bookings, those kinds of things. And as we break those monoliths and create more microservices and we transition to the cloud, we are exploring opportunities for what can be moved to MongoDB, so that we have better decoupling of the schema and give more flexibility to developers. At the same time, cache use cases are heavily on Redis, which we are actively working to migrate to more cloud-native databases, especially with all the license changes that are going on all over the open source community. So if you look into low-latency writes, anytime we are working around use cases which really require low-latency writes, we prefer Cassandra or ScyllaDB, a key-value database like that. So there are different things to keep in mind when we are selecting these databases. Obviously the CAP theorem: what is important to you, is it consistency or partition tolerance? Based on that, we pick and select a database for the best use case.

And in terms of the scale of challenge that you're supporting in your current role, I'm wondering if you can talk to some of the ways that the existing platform offerings were starting to run into challenges, or some of the edge cases that you dealt with, or some of the ways that the teams were maybe not using those systems to their best effect, and the ways that you've thought about how to make that more of a standardized offering so that it was falling-off-a-log easy.

Yes. So it's multiple things. So the platform offering is one for scale and quicker
adoption; another is for governance and making sure all the best practices are followed. So when we initially started, let's say with one org or one section of the company that we were supporting, when we started looking into databases, a few things were happening. With a DevOps culture, everybody started building up their own MongoDB. But when developers want to set it up, they set it up with an intention to quickly develop their code and go to market. They don't pay too much attention to whether the database has the right parameters, does it have authentication, does it have SSL, and all of these very deep nitty-gritties of the databases. So when we started looking into it, we found things like databases that don't have basic authentication; sometimes they don't use the right parameters for that specific access pattern. So what ends up happening is either, six months after that, they come and say MongoDB is not the right use case because I'm not getting the performance I needed, or they will throw infrastructure at it and 10x the box to shield performance issues, which will increase the cost. So to solve that problem, because it was one organization, we started using infrastructure as code, Terraform and EC2 and all of that, and came up with a basic platform which creates all of these databases. It was a simple implementation of a data platform because it's only like four different AWS accounts. Okay, we have a Terraform template for MongoDB, a Terraform template for Cassandra, and once the developers use that, it creates that specific cluster. For example, for Cassandra, it creates a six-node cluster with all the right DC configurations and everything. So it shields the developers from all the nitty-gritties of databases. That itself is a value add right away, which saves their time, and at the same time it was able to give a lot of governance benefits for the company, because now I have authentication as default, now
I have multi-DC as default in case of disaster recovery, and things like that. But the scale... it was good, it worked, but then at some point the company started consolidating all of the platforms. So we went from one org, which is four AWS accounts, to almost 400 AWS accounts, which just blew out of proportion, and we realized that the current solution won't work. So we started looking into a data platform that can work at scale. That's when we started looking into Service Catalog and different cloud providers, and with Service Catalog we use that hub-and-spoke model, where we were able to solve that scale issue. Now it doesn't matter if it's 400 accounts or 4,000 accounts, and whether it's 100 different database clusters or 8,000 clusters; we were able to centrally deploy, manage, and monitor all of them at scale.

In terms of the teams that you were working with, what were some of the characteristics of the other engineers and their level of familiarity with these data layers, and the level of attention that they wanted to pay to that aspect? Because I know from working with teams of various compositions, developers just want to say, just give me something that I can throw data at, I don't want to have to care about the rest of it. Whereas if you're dealing with data engineers, they're going to be much more hands-on about selecting the actual storage layer technologies and the ways that they're interoperating.

We prefer the second one, because as part of this platform building and everything, there is also a consultancy service, right? We like to be involved from day one, where you have a use case and the data side of folks get involved so that we can figure out what is the best database technology. Because there were instances where the POC started, and before you realize the POC ended, it was productionized, and by the time you figure out that this is not the right technology, maybe you can move from Redis to, like, you
know, some other database, it's already too late, because you are committed to release a product and you go to production with that. And just by changing the backend database, in some cases we were able to save like $2 million. So it's very important to get the right database technology, right? So as far as what kind of mix we have, considering we're a big online travel company, we have almost all kinds of teams. Some teams are really good at what they do, and they have been managing some of these databases for a long time, so they know all the intricacies. They are like, hey, we don't need to onboard to the data platform, we are good managing it ourselves. We respect that. So the whole point of the data platform is not to impose something on everybody; it's to help unblock them and move them quickly to reach their goal. But the majority of the company, like 90 to 95% of the company, they were like, wow, this is great. I don't have to worry about infrastructure, I don't have to worry about 24x7 support or performance tuning. You take care of that; I am happy writing code. So that's what I have seen across the industry as well. Whenever I go and present these data platform related issues, most of the developer community is happy offloading that to somebody else, and they focus on building the next shiny thing.

And so in terms of the options and offerings that you were building out to support those different styles of team, I'm wondering how you thought about the base case of, I just want to be able to throw up a database and be able to start working with it, and then having levels of complexity that you're able to expose for the teams that wanted to have more control and fine-tuning of the size or scale or throughput of the different data layers.

Yes. So obviously one size doesn't fit all. So whenever we are providing these options, there are multiple parameters. There will be an intake form where they
will be able to write what their latency expectations are, what their throughput or data size would be, and how it will grow in the next two years, three years, five years. All of this information is collected. Based on that... obviously, when you let them configure everything, it doesn't make sense; again it adds a lot of complexity on their end. So we have different sizes: a small-size database, a mid-size database, or a large-size database. So based on the information that they have entered, we output that, okay, based on your information you can go with a medium-sized infrastructure, which can be a six-node Cassandra cluster, for example. If it's a large size, it can be like a 30-node Cassandra cluster. So abstracting all of the technical details also will help them. They will say, hey, this is my requirement, tell me how big of a cluster I need, and they will go and select that and it deploys that big of a cluster.

And then another aspect of providing these systems as a service is working to integrate into the standard workflow and tooling that the teams that you're supporting are already using. You mentioned that you were using Terraform, at least initially, for provisioning these setups. Not every team necessarily wants to get up to speed with Terraform; they just want to be able to say, just click a button, the CI runs, and everything's great. And I'm wondering how you thought about managing the interfaces that you were exposing to these different teams to be able to provision the resources that they need for their use cases.

So they don't necessarily have to write Terraform. That's the beauty of it, right? So earlier they were writing Terraform to create a Cassandra cluster and all of that. Now we write the Terraform; they have a JSON call with a set number of parameters, like the cluster size, or what kind of database technology, and specific parameters that they want to tweak. So even writing Terraform or anything like that is abstracted. So when we moved to the bigger phase
two of the platform with Service Catalog, where we are using Service Catalog and CloudFormation, they don't even have to write any code. They have two options. They can go to the Service Catalog UI and select: I need a Cassandra database, six nodes, these parameters I want to tweak. Or they have another option: we built an API on top of Service Catalog where they will just call the API, and that they can integrate in their workflow. And anytime they are using repetitive testing environments where they build and destroy the clusters, they can put it as part of their code, and they build the cluster, they do the testing of the code, or they productionalize it. If it is a production cluster, they call it one time and that infrastructure is available.

Obviously, in order to be able to provide these different database engines as a service, it requires a decent amount of familiarity with their operational characteristics, the scaling considerations, the ways that you need to manage orchestration of the nodes entering or leaving the cluster, fine-tuning of the throughput. And that requires a lot of effort and time investment, and usually a lot of errors that come up as a result. I'm wondering if you can talk through some of the ways that you helped to build that operational familiarity and operational comfort with the different engines, and the ways that you selected the engines that you wanted to support as part of that core offering.

Yes. So again, when we are building the platform, the advantage is, let's say a company has 20 different teams and, if they're running Cassandra, they need at least 10 different DBAs integrated in their own pods. That's the typical DevOps model. Now if we are bringing them all to the platform team, we only need like two or three who are experts. So they set up everything, the configuration and everything, and everybody else uses it. So we don't need an expert embedded in every team. So the platform
team abstracts all of that, and once you have that, best practices are defined: this is how you need to create secondary indexes, or this is how you need to build your access patterns. They're all defined. And as I said, when we are involved from day one, let's say there is a use case, from day one we can help them make sure that they don't make costly mistakes which will hurt later, but adhere to those access patterns and the best practices, and that will help us with successful use case deployment.

And so talking a bit more about the implementation of the platform management, you mentioned that you were using the AWS Service Catalog. How did that help to enable the scaling of your platform offering, and what were some of the other technologies that maybe you tried and failed with before you got to that solution?

Yes. So initially, as I said, we were using Terraform to manage four different accounts, but with 400 accounts it was not an option. So we tried a few things, like entirely coding a different platform with this hub-and-spoke model and things like that. But when we started doing research, we saw that this is a readily available option. Service Catalog is not just for a data platform or anything; for any product, if you have infrastructure as code that you want to deploy, you can use Service Catalog. It can be anything from an EC2 instance to a network configuration to a database. So when we started researching, we came to know about Service Catalog. So we did a quick POC and it just took off. When we saw that, okay, we were able to manage 400 different accounts and create database clusters within those accounts from a central place with one set of templates, that gave us the power to quickly onboard all of these brands to one central platform and manage them. So yeah.

This is a pharmaceutical ad for Soda Data Quality. Do you suffer from chronic dashboard distrust? Are broken pipelines and silent schema changes wreaking havoc on your analytics? You may
be experiencing symptoms of undiagnosed data quality syndrome, also known as UDQS. Ask your data team about Soda. With Soda Metrics Observability, you can track the health of your KPIs and metrics across the business, automatically detecting anomalies before your CEO does. It's 70% more accurate than industry benchmarks and the fastest in the category, analyzing 1.1 billion rows in just 64 seconds. And with collaborative data contracts, engineers and business can finally agree on what "done" looks like, so you can stop fighting over column names and start trusting your data again. Whether you're a data engineer, analytics lead, or just someone who cries when a dashboard flatlines, Soda may be right for you. Side effects of implementing Soda may include increased trust in your metrics, reduced late-night Slack emergencies, spontaneous high-fives across departments, fewer meetings and less back-and-forth with business stakeholders, and in rare cases, a newfound love of data. Sign up today to get a chance to win a $1,000+ custom keyboard. Visit dataengineeringpodcast.com/soda to sign up and follow Soda's launch week, which starts on June 9th.

And then another element of offering databases as a service is the consistency that you're offering in terms of the setup and the scalability, but also there's the security model and the requirements that exist around different types of data that you're working with, and the ways that data can be propagated and moved between different systems. And I'm curious how you worked through some of those elements of governance and setting expectations and requirements with the numerous teams that you were supporting, and the organizational buy-in that you had to get as you started implementing those various controls and constraints.

Yes. So we were focused on the data platform. For the data governance aspect, like GDPR or data scrubbing, we have a separate org just focusing on that. But basic data security, like authentication, SSL in transit, and SSL at rest, we can
control that now, because we are not leaving it to the developer to make sure that he's securing the data. Once it is in the platform, all of these are guaranteed; they are all free. So encryption at rest, encryption in transit, authentication, logging, audit, all of that is enabled, and it's all part of the best practices.

And in working with the different teams, beyond just the core security elements, what were some of the challenges that they ran into, as far as maybe they were running into limitations or various security constraints that you were enforcing, and some of the ways that you worked to familiarize them and just document the standard practices and expectations that they would need to conform to, to be able to use the platform that you are offering?

Yeah. So obviously there is friction. When you are already running a database and we go and ask them to onboard to this platform, if it is a like-to-like onboarding, like if you're using one particular database and you're just going to the same database on the platform, it's pretty straightforward, because you join to that cluster, stream the data, and remove the old infrastructure. But at the same time, we took the opportunity to modernize and update some of the database choices as well, like moving from SQL Server to Aurora, or for example sometimes MongoDB to DocumentDB, for various reasons. So that requires a lot of developer buy-in, because some form of rewriting the code is involved. But again, it is your organizational goals; if they're aligned to your organizational goals, you can get enough traction for that, and we were able to move. The biggest challenge we have faced is, I would say, developers worrying about losing control. Now, if I have a performance issue, I can double the box and I don't have to talk to anybody. But if you're moving to the platform, there is an additional layer. That was a bigger culture change that we had to bring within the company: that, hey, you know, we are not
somebody that is stopping you from doing good stuff; we are trying to accelerate you to do that good stuff. The way I tell everybody is, hey, we are just an extension of your team taking care of a specific aspect of it. We are just your team focusing on this so you don't have to worry about it. That really helped us, and obviously a lot of communication, collaboration, and the team culture aspect of it. A lot of times platforms bring silos, even without realizing it. But if you implement it right, with great collaboration and building it with the interest of accelerating developers, it really works.

And then from that migration perspective, where maybe you have a team that's using an existing database and they need to move to a different engine, or even if it's the same engine, obviously there are various uptime considerations and various challenges that exist in any migration, particularly if you're moving from an unconstrained or an unopinionated approach to a more opinionated and constrained system. And I'm curious how you worked through some of those challenges of doing that data migration, particularly with considerations around uptime.

That has still been a bigger challenge. Why we still run a lot of legacy relational SQL stuff is because of the same thing: how do we migrate to more modern databases without downtime, and also, how do we refactor all the code? There is code which is 15 years old; nobody wants to touch it, honestly. Whoever wrote that code is gone. The people who are here, they're happy as long as it runs, and they don't know what to do if it breaks. So that is still a challenge that is holding us back on migrating the last portion of some of the stuff to the platform. But mostly it's working with the NOC team and setting up... we have monthly windows where we control how much downtime there is. We make sure that we do all the pre-steps and everything, and most of the time it's just a flip. Okay, the
cluster will run in an expanded mode for a long time, and when we are ready, we take the two minutes of downtime and flip it. But most of the time the NoSQL databases like Cassandra and MongoDB give us the flexibility to do all of this migration without any downtime.

In terms of the workloads, and you mentioned the relational engines, there are definitely a number of legitimate uses for relational systems. They're not necessarily always adaptable to a NoSQL use case. I'm wondering what your overall approach is as far as the relevance and utility of NoSQL compared to relational engines, and maybe some of the challenges of scale that you're experiencing with those relational engines that will maybe motivate you to do something that is more of a NoSQL flavor.

So again, as I said, we are not coming in with a preconceived idea of, this is the database you need to use. There is a very valid use case still to use relational, and a lot of our workloads still run on Aurora and Postgres and MySQL, all of that. What we are trying to do is migrate more from SQL Server, Oracle kind of on-prem setups to more cloud-native relational databases. So we will still continue to heavily use relational databases, but in more cloud-native relational setups. We don't want to move every use case to NoSQL. Transactions, bookings, they're all critical; they need strict ACID properties and things like that, so we will continue to use those on relational databases. What we want to do is use the specific tool for the specific use case, and also run it with the least cost. If there is an open source database out there which can solve my use case without paying any license cost, we want to use that instead of paying heavy licenses or overhead costs around these legacy databases.

So one of the other things that you mentioned earlier was the shift from a DevOps-style approach to more of a platform engineering approach, where to begin with you had these embedded experts who worked with all the
different teams but maybe didn't have as distributed of a set of knowledge around the different database engines, and now you've consolidated a lot of that expertise into your platform team with a focus on the data layer. And I'm wondering how some of the lessons learned in that have maybe translated to other operational elements of your overall engineering stack, to invest more in a platform versus DevOps-style approach, and some of the ways that those DevOps-style resources have either shifted focus, or some of the other types of work that they're still engaged in within those teams in an embedded fashion.

Yeah. So one thing is, platform engineering is not a replacement of DevOps; it's an enhancement of DevOps. Instead of the DevOps team taking care of 100 different things, they just now take care of 10 different things and let the platform teams worry about the rest. So when we brought all of these database engineers to a central team and built that platform, that really worked. It really freed up a lot of time for DevOps people, because let's say you are a Java engineer or a CI/CD engineer, you may not know all aspects of networking or databases or how to set up 10 different services in AWS. That will free them from the fatigue of learning 100 things to make one thing work, and free up a lot of their time so that they can focus on what they are good at. The lesson learned is, obviously, as I said, this is not one-size-fits-all. We don't want to build a platform and enforce it on every organization in the company. The way we want to do it is the reverse: we want to go and understand what the current challenges are, and come and try to fix them using a platform. That's the biggest lesson learned. Anytime we are building other platforms, or even in the future if I build a platform, the fundamental concept is you don't start a platform team to build your own org; it's to identify
a challenge in the organization and help them solve it, not to have like 20 different reports added to you.

Another aspect of having that platform team and that more centralized capability, as you mentioned, is it can be more scalable and more efficient, but it also requires organizational support to invest in that dedicated resource versus having everybody be more generalized and the expectation that everybody manages their own requirements. And I'm wondering how you've seen some of the organizational investment and some of the ways that you've had to work with the broader organization to help promote the utility of having that centralized investment and centralized resource.

Yeah, definitely. We have to start small. You can't just hire like 20 different people and say, I'm starting a platform team. You start small and show the value. That's what we did. We just did it in one org, where we started this platform concept and added value, showed the value in terms of monetary benefit and in terms of quality. If you have a generalist in 20 different teams versus two or three specialists, that adds a lot of value, especially when you are having an incident. When you're running open source databases and you are having an incident, if you have a generalist, the probability of quickly solving it is less compared to when you have an expert. So when we did the small setup for almost 18 months and we showed the company that this is the value add, that's when we started onboarding different organizations and different brands and all of that. It took almost two, two and a half years to onboard all of the company, because they were seeing value. One dev team started using the platform; they were like, okay, these are all the values that I'm getting. Obviously they all talk in the dev communities, and that's how it spread, by word of mouth. We never enforced it in the company. That's the beauty. We never went and said that you need to onboard to the platform by so
and so date. We said, this is available; if you onboard, these are your benefits. And that's how it has grown organically.

In your experience of building that capacity, working with the engineering teams, and helping to promote your approach in various conference presentations and talking to other organizations, what are some of the most interesting or innovative or unexpected ways that you've seen that platform capacity applied, either within your own team and organization or as it translates to other teams and companies?

I initially thought this multi-account problem was specific to us because of the huge number of brands and accounts that we had. But once I started going out and talking at various conferences, it's like a lot of large companies have the same problem, because I think at one point AWS starts asking you to add more and more accounts, because they start hitting limitations, IP limitations and resource limitations and things like that. At least now they have increased their caps a lot, but five, six years back we ran into IP limitations in a lot of accounts; we were not able to provision new EC2 instances and things like that. So that's when they said, hey, build a new account. If you're a new team, build a new account. That's how we ended up with, you know, hundreds of accounts. And a lot of major companies that I spoke to have the same problem, and they started looking into a similar hub-and-spoke model and centrally managing infrastructure, which has given them a lot of good results as well.

And in your own work of building out this capacity, investing in the engineering effort and the management effort required to bring the company along and see engineering success from it, what are some of the most interesting or unexpected or challenging lessons that you learned personally?

I would say we should always focus on automating things. This was true five years back, but this is more true with AI and ML, you know,
catching up now. But the biggest advantage that we saw from the operational aspect is that, once we had this platform, we asked everybody in the team: if you are repeating a task manually more than three or four times, what can we do to automate it? Building automation around a lot of things has really helped us. If I can take a few minutes to explain this: okay, we had this infrastructure, that is good; we had monitoring, everything set up. Now my engineers are handling at least four or five requests every week on scaling clusters. Then we asked, how can we automate it? So we started building scalable infrastructure where, at the click of a button, I can go from six nodes to 60 nodes and come back whenever the marketing events and things like that are done, instead of manually doing everything as we were. Every week we were thinking, "Hey, what can we automate this week? What can we automate next week?" That way we automated scaling, and we automated incident resolution. That is one of the great things that we did. Let's say you are getting disk space alerts. If you page me at 2:00 in the night because the disk is filling up, what will I do? I'll go and expand the EBS volume. So we thought, why can't we automate it? We used AWS EventBridge and other services and built automation around these incidents. If I get a page, the bot will get the page first, and the bot will see, "Okay, this is a similar pattern of a page that I can automate," and then, based on the automation scripts that we provide, it will go and fix that page: expand the EBS volume, or, if a node is down, say on a 60-node cluster a Cassandra node is down for whatever reason, it can immediately bring it up, watch for 5 minutes, and if everything is okay, automatically close the page instead of paging a real person. So by implementing this automation mindset for everything, from incident
resolution to scaling to all the aspects of management, we were able to save a lot. We reduced our pages by 40%: today at least 40% of our pages are handled by the bot instead of a real DBA. Obviously we'll have a report the next day and see, "Okay, these are the pages that the bot handled; what do we do to enhance it more?" Traditionally, DBAs had more of a mindset of setting everything up manually, but I think right now even the DBAs need to focus on learning Python, and they need to focus on learning different AI/ML trends and analyzing what they can automate to reduce the manual workload. That is the biggest thing that I have learned over the years: automation first.

And also knowing when automation is not feasible and you need to just get something done, because there's the famous XKCD comic of the amount of time that you think it's going to take to automate something and then the amount of time it would take to just do it manually, and they're not always in line with each other.

Yeah, definitely. That's why every week we think, "Hey, are these repetitive tasks that you are doing more than three times?" If it is a one-off thing, there's no point, definitely. But if you think, "Okay, this is something that I will use four more times," definitely have a script around it.

And so for people who are looking to invest in their own platform capacity, or they're looking to invest more into non-relational engines, what are the cases where you would advise against either or both of those approaches?

If you're a small company with a set number of teams, I don't think investing in a platform helps; just have your DevOps engineers focus on best practices and things like that. Also, if you're not building infrastructure every day, let's say I set up infrastructure for the next 6 months, I build my use case, I'm going to the market, and it's sustainable, it's okay,
you don't need to build a platform. But if you're constantly creating infrastructure and scaling and evolving, and you are a large or at least a midsize company, I would say you have to invest in a platform mindset. If you're a very small company, a startup, I think it's an overhead. And I think it's the same thing with the NoSQL technologies as well. The advantage you have if you are starting something new is that you have an option: you go and pick the best thing for it. But if you are a legacy company, or still have a lot of tech on old legacy stuff, you obviously need to do the return-on-investment analysis on what it takes to migrate and whether it adds any additional value.

And as you continue to invest in these automation capabilities, managing the scale and variety of offerings that you are supporting, what are some of the things you have planned for the near to medium term, or any new technologies or new capabilities that you're looking to invest in?

Right now we are doing a lot of work on AI/ML around data infrastructure as well, where, if we have an incident, it has to go and scrape all the previous incidents, come up with an action plan, and tell me, "Okay, this is your problem; it happened on this day; this is what you did; quickly fix it." That's one example. Another is automated scaling. Right now, let's say we have a marketing event and things like that: we say, expand the cluster from six to 60 nodes. The process of expanding or scaling the cluster is all automated, but the decision to scale is still manual. So we also want to use AI/ML to make that decision: whenever it's seeing some particular trends, it can autoscale and come back to the original capacity later. In addition to the data infrastructure, we also want to expand it to the app layer, where you have a use case and you need, like, 10 app servers or 10 data infrastructure servers
and all the aspects of it; we want to tie everything into a pod instead of just the data portion of it.

Are there any other aspects of the work that you're doing, the overall approach of platform engineering for building these database-as-a-service capabilities, or the specific engineering challenges that you've been tackling in the process of building out that capacity, that we didn't discuss yet that you'd like to cover before we close out the show?

A few things. I want to emphasize again that platform engineering is not a replacement for DevOps; it's mostly there to enhance and amplify the DevOps culture. If done right, it gives a lot of self-service capabilities to the developers, reduces their complexity, and standardizes a lot of tools and workflows within the company. So it's a great investment if you're a midsize or a large company.

All right. Well, for anybody who wants to get in touch with you and follow along with the work that you're doing, I'll have you add your preferred contact information to the show notes. And as the final question, I'd like to get your perspective on what you see as being the biggest gap in the tooling or technology that's available for data management today.

I think there is a difficulty with real-time enforcement. As the data moves fast, especially with streaming, the ability to govern and manage it in real time becomes really crucial, and many traditional governance and management tools are not built for that speed. So that's a big gap; for me, real-time enforcement of governance on the streaming side is still a gap.

All right. Well, thank you very much for taking the time today to join me and share the work that you've done and your experiences of building out this platform capability for databases as a service at your organization, and some of the ways that you've addressed those challenges of scale and organizational buy-in. It's definitely a very
interesting problem space, and definitely an important one as you scale your capabilities and scale the organizational complexity. So I appreciate the time and energy you're putting into that, and you sharing your insights, and I hope you enjoy the rest of your day.

Pleasure is mine. Thanks.

[Music]

Thank you for listening, and don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used, and the AI Engineering Podcast is your guide to the fast-moving world of building AI systems. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. And if you've learned something or tried out a project from the show, then tell us about it: email hosts@dataengineeringpodcast.com with your story. And to help other people find the show, please leave a review on Apple Podcasts and tell your friends and co-workers.

[Music]
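
The incident-automation flow described in the interview (the bot receives the page first, checks whether it matches a pattern it can automate, runs the fix, watches the resource, then closes the page or hands it to a person) can be sketched roughly as below. This is an illustrative sketch only, not the guest's actual system: the `Page` class, the alert names, the `RUNBOOKS` table, and the placeholder handlers are all hypothetical, and a real version would call the cloud provider's APIs and wait between health checks.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Page:
    """A single alert. Field names and alert types are hypothetical."""
    alert_type: str          # e.g. "disk_space" or "node_down"
    resource: str            # e.g. a host or volume identifier
    status: str = "open"


def expand_volume(resource: str) -> str:
    # Placeholder: a real handler would ask the cloud provider to grow the disk.
    return f"expanded volume on {resource}"


def restart_node(resource: str) -> str:
    # Placeholder: a real handler would bring the database node back up.
    return f"restarted {resource}"


# Known page patterns mapped to the automation script that fixes them.
RUNBOOKS: Dict[str, Callable[[str], str]] = {
    "disk_space": expand_volume,
    "node_down": restart_node,
}


def handle_page(
    page: Page,
    is_healthy: Callable[[str], bool],
    audit_log: List[str],
    watch_checks: int = 5,
) -> Page:
    """Try to resolve a page automatically; escalate to a person otherwise."""
    runbook = RUNBOOKS.get(page.alert_type)
    if runbook is None:
        # Unknown pattern: page a real person.
        page.status = "escalated"
        audit_log.append(f"{page.alert_type} on {page.resource}: no runbook, escalated")
        return page

    action = runbook(page.resource)

    # Stand-in for the "watch for 5 minutes" step: several consecutive
    # health checks (a real bot would sleep between them).
    healthy = all(is_healthy(page.resource) for _ in range(watch_checks))
    page.status = "closed_by_bot" if healthy else "escalated"

    # The log feeds the next-day report of what the bot handled.
    audit_log.append(f"{page.alert_type} on {page.resource}: {action}, {page.status}")
    return page


if __name__ == "__main__":
    log: List[str] = []
    for incoming in [Page("disk_space", "db-01"), Page("replication_lag", "db-02")]:
        handle_page(incoming, is_healthy=lambda resource: True, audit_log=log)
    for entry in log:
        print(entry)
```

Keeping the pattern-to-script mapping in one table means that supporting a new kind of page is a matter of adding an entry, and anything not in the table still reaches a human.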


Key Vocabulary (CEFR C1)

Word	CEFR	Definition	Example
engineeringpodcast	B2	Fragment of the domain name dataengineeringpodcast.com; not a standalone English word.	"migration into weeks visit dataengineeringpodcast.com/datafolds today for the details your host is Tobias Macy"
availability	B2	The quality of being available.	"the cap theorem what is important to you is it like consistency or availability"
mechanical	B2	Operated by or relating to physical moving parts; a mechanical keyboard has a separate physical switch under each key.	"custom mechanical keyboard visit dataengineeringpodcast.com/soda to sign up and follow Soda's launch week which"
schedules	B1	Plans that list when tasks or events are to take place.	"and setting up schedules we have monthly uh maintenance windows where we control"
maintenance	B2	Actions performed to keep some machine or system functioning or in service.	"and setting up schedules we have monthly uh maintenance windows where we control"
datafold's	B1	Possessive form of Datafold, the sponsor named in the episode; a proper noun rather than a general vocabulary word.	"datafold's AI powered migration agent changes all that their unique combination of AI code translation and"
year-long	B1	Lasting one year; of a timespan of one year.	"they're so confident in their solution they'll actually guarantee your timeline in writing rated to turn your year-long"
introducing	B2	(of people) To cause (someone) to be acquainted (with someone else).	"offerings and being able to provide databases as a service at scale so Chakravarti can you start by introducing"
experiences	B2	Events or activities that someone has lived through, and the knowledge or skill gained from them.	"experiences of building out this platform capability for databases as a"
architectures	B2	The structures and organizing principles of complex systems, such as software or data platforms.	"platform for a leading online travel company and I have nearly two decades of experience with scalable architectures"


Grammar & Pronunciation Tips for Dictation

Chunking

Notice where the speaker pauses after groups of words; these pauses make the speech easier to follow.

Linking

Listen for how words connect as they run together.

Intonation

Follow changes in intonation that emphasize important information.

Video Difficulty Analysis & Statistics

Category

basic

CEFR Level

Duration

Total Words

7462

Total Sentences

408

Average Sentence Length

18 words

