Q2 2022 Palantir Technologies Inc Earnings Call

August 8, 2022

PG, Foundry, Nexus Peering, MetaConstellation and Apollo, all of which were built before their time, all of which have made <unk> 41% CAGR possible.

These products should be measured not just by their ability to throw off free cash flow and generate outsized revenue, but most importantly by their quintessential attribute: large companies, which essentially control distribution, cannot easily copy them, if at all. They are too integrated, difficult and thick to be replaced by larger incumbents and then distributed through their distribution chain.

All software products actually have to be measured by this: Is it replaceable? How easily could it be replaced? Is the underlying platform durable or fleeting? And could a third party, highly technical and with massive distribution, disrupt these products?

If you look closely at PG, Foundry, Nexus Peering, MetaConstellation and Apollo, as different as they are, they have one thing in common: it would take many, many years of the world's best engineers to build them, and you would be building them as we improve them and as we capture the market.

Thank you.

Thank you, Alex.

We have the great privilege of being on the forefront of the problems that matter most in the world, from the war in Ukraine to fighting famine and monkeypox. Across government and commercial, the opportunity in front of us is enormous, which makes the revised near-term outlook all the more disappointing.

It doesn't come close to representing our ambition and the opportunity before us.

While the timing of large contracts in government can be frustrating, the underlying requirements and needs are enduring.

It's worth noting that our revised guidance excludes any new major U.S. government awards.

At the same time, we have seen the opportunity presented by this environment before.

As organizations around the world face more pressure and experience more pain, there will be a slowdown in the rate of spending and a lengthening of sales cycles.

It will also reveal gaps in enterprises' operations, gaps our software can solve.

In the short term this means less revenue now, but on longer time horizons it accelerates our business.

The global financial crisis, the attacks in Europe, the COVID pandemic: through each of these, we emerged substantially stronger by investing in our customers ahead of revenue and delivering results in days, not months.

It is exactly in times like these that we build our most important and most impactful partnerships. Each of these periods has been an inflection point for Palantir.

A time during which we developed pathbreaking software platforms and expanded our footprint.

To our results.

We generated $473 million in revenue in Q2 2022, representing a growth rate of 26% year over year and 6% sequentially.

Our customer count increased to 304, up from 169 a year ago.

Our business in the United States alone generated more than $1 billion on a trailing 12-month basis, representing 42% growth.

I'll now hand it over to Shyam.

Thanks, Ryan. Our revenue growth in the U.S. commercial market continues to be our focus and is gathering momentum.

Our U.S. commercial business grew 120%, rising from $39 million in Q2 2021 to $86 million this quarter. Our core U.S. commercial business, which excludes strategic investments, has grown 10% quarter over quarter or greater for the past four quarters, most recently reaching $67.4 million, 14% sequential growth.

Our core U.S. commercial ACV grew 2.4x, our best quarter ever.

Our U.S. commercial customer count grew from 34 to 119 customers year over year.

Our overall commercial revenue grew 46% and continued to increase for the ninth quarter in a row on a quarter over quarter basis, reaching $210 million in Q2.

We are seeing former customers, particularly those in the U.S., including some of the world's largest transportation, banking and retail enterprises, return to our platform in increasing numbers after periods of experimentation with other platforms and approaches. These customers are not returning to our software platforms merely because of the expansion of our sales operations, which does continue. They are often returning because they have tried other options and those options have failed to deliver needed results.

Our commercial business in Europe and elsewhere outside the United States continues to expand as well, although the exceptional strength of the U.S. dollar has been a factor, given the significantly increased effective cost of goods.

Our government work remains at the center of everything that we've built and everything that motivates us.

The expansion of our reach with agencies working on the front lines in conflicts around the world, including in Eastern Europe and the South China Sea, continues. The contracts we have in our pipeline are key to facing the challenges in front of the West.

Our government business grew to $263 million in revenue in Q2, rising 9% quarter over quarter.

More broadly, across the commercial and government sectors, health care has become a substantial and rapidly growing business, generating approximately $153 million in revenue in the first half of 2022, up from $42 million in the first half of 2020, at the onset of the pandemic. That represents a 91% compound annual growth rate.
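As a rough check on the health care growth rate quoted above, the compound annual growth rate can be recomputed from the two half-year revenue figures. This is only an illustrative sketch: the function name `cagr` and the choice to treat first half 2020 to first half 2022 as two annual periods are assumptions made for the example, not something stated on the call.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate as a fraction (0.91 means 91%)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Health care revenue figures quoted on the call, in millions of dollars.
H1_2020_REVENUE = 42.0
H1_2022_REVENUE = 153.0
YEARS = 2.0  # assumption: H1 2020 to H1 2022 counts as two annual periods

healthcare_cagr = cagr(H1_2020_REVENUE, H1_2022_REVENUE, YEARS)
print(f"Health care CAGR: {healthcare_cagr:.0%}")
```

Under those assumptions the result rounds to about 91%, which matches the figure given on the call.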

We see substantial opportunities for Palantir as a health care technology company, driving transformation across major pharmaceutical companies, biotechnology companies, insurers, providers, regulatory agencies and research organizations, building on our work with organizations such as the UK's NHS, U.S. HHS, the CDC, the FDA, the NIH, Sanofi, <unk>, Merck and <unk> health care business in Japan.

Our software products and platforms continue to operate behind the scenes in connection with some of the most significant events around the world, including the distribution of vaccines to millions and an ongoing war in Eastern Europe.

One of our most recent offerings within Foundry, a no-code application development environment known as Workshop that allows users with little or no coding experience to build operational applications on top of data warehouses within minutes, is showing particularly strong growth, with more than 10,000 developers now building applications within the platform.

And our most recent offering, Pipeline Builder, brings the same no-code approach to authoring data pipelines, turning every business analyst into a production data engineer in Foundry.

We just launched an experimental feature that leverages OpenAI's GPT-3 to turn natural human language into pipeline logic. A request like "find me all the hospitals that have limited ICU bed availability in the next three weeks" is transformed into a pipeline in seconds, a new standard in no-code.

Forrester named Foundry a leader in AI/ML platforms as part of the Forrester Wave: AI/ML Platforms, Q3 report.

The Foundry operating system received the highest possible scores in the product vision, performance, market approach and applications criteria.

Turning to Gotham, urgent operational needs are driving innovation in our integrated hardware-software offerings. Building on products like Titan, Skykit will combine <unk> MetaConstellation software on a small, human-portable form factor with Starlink comms to enable heroes in the field to task low-latency, AI-driven satellite collection.

Our focus in the short term remains on making our three principal software platforms, Gotham, Foundry and Apollo, available to broader segments of the market with unbeatable time to value.

I'll turn it over to Dave to take us through the financials.

Thanks, Shyam. As we've highlighted, our U.S. business is remarkably strong.

U.S. revenue grew 45% year over year to $290 million, and on a trailing 12-month basis U.S. revenue grew to more than $1 billion.

U.S. commercial revenue grew 120% year over year to $86 million. Our core U.S. commercial revenue, which excludes our strategic investment program, grew 14% sequentially.

On the customer side, our net new U.S. commercial customers grew 16% sequentially.

U.S. government revenue increased 27% versus the year-ago period to $205 million, up from a 16% year-over-year increase in the first quarter.

Turning to our global top-line results, second quarter total revenue grew 26% year over year, ahead of our prior guidance, to $473 million.

Our overall net dollar retention was 119%.

Commercial revenue increased 46% year over year to $210 million.

Government revenue increased 13% versus the year ago period to $263 million.

Our global customer acquisition remains strong.

We added 27 net new customers in the second quarter, bringing our Q2 2022 customer count to 304, an 80% increase year over year.

We added 19 net new commercial customers, which represents 157% growth year over year.

Our growth with existing customers also remains strong. Trailing 12-month revenue from our top 20 customers increased 17% year over year to $46 million.

Second quarter billings were $396 million up 5% year over year.

In the second quarter, TCV booked was $792 million.

<unk> booked was $588 million.

As Alex mentioned, we saw that large new USG contract awards have been pushed out.

While government bookings grew quarter over quarter, increasing 128% sequentially, this was primarily driven by renewals.

We ended the second quarter with $3.5 billion in total remaining deal value, roughly flat quarter over quarter. Despite the excellent TCV quarter, total remaining deal value was impacted primarily by two things.

One, we voluntarily terminated several contracts related to investment commitments that we decided not to move forward with.

Two, our TCV number included the successful conversion of option years from a U.S. commercial customer contract, resulting in lower overall deal value but a secured commitment for additional years.

We ended the second quarter with $1.2 billion in remaining performance obligations, up 79% year over year.

As a reminder, RPO is primarily comprised of our commercial business, as it does not take into account contracts with an initial term of less than 12 months and contractual obligations that fall beyond termination-for-convenience clauses, both of which are common in our government business.

Turning to our margins and expenses, adjusted gross margin, which excludes stock-based compensation expense, was 81%.

Second quarter adjusted income from operations, excluding stock-based compensation and related employer payroll taxes, was $108 million, representing an adjusted operating margin of 23%, ahead of our prior guidance of 20%.

Second quarter, adjusted expenses were $365 million up 11% sequentially.

Second quarter adjusted earnings per share was negative <unk>, which includes a negative <unk> impact driven primarily by losses on marketable securities.

We generated $62 million in cash from operations, and our adjusted free cash flow was $61 million, representing a margin of 13% and our seventh consecutive quarter of positive free cash flow on an adjusted basis.
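The margin figures quoted above follow directly from dividing each measure by quarterly revenue. The sketch below is a hedged illustration of that arithmetic using only the rounded figures from the call; the helper name `margin` and the constant names are assumptions introduced for the example.

```python
def margin(amount: float, revenue: float) -> float:
    """Return amount as a fraction of revenue (0.23 means 23%)."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return amount / revenue


# Rounded Q2 2022 figures quoted on the call, in millions of dollars.
Q2_REVENUE = 473.0
ADJ_OPERATING_INCOME = 108.0
ADJ_FREE_CASH_FLOW = 61.0

adj_operating_margin = margin(ADJ_OPERATING_INCOME, Q2_REVENUE)
adj_fcf_margin = margin(ADJ_FREE_CASH_FLOW, Q2_REVENUE)

print(f"Adjusted operating margin: {adj_operating_margin:.0%}")
print(f"Adjusted free cash flow margin: {adj_fcf_margin:.0%}")
```

With these rounded inputs the two ratios come out at roughly 23% and 13%, consistent with the margins stated on the call.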

On a trailing 12-month basis, we have generated $314 million in adjusted free cash flow.

We ended the second quarter with $2.4 billion in cash and cash equivalents and no debt.

In July we expanded our revolving credit facility, adding a new $450 million incremental delayed-draw term loan facility, which provides for additional liquidity of up to $950 million and remains entirely undrawn.

As we highlighted last quarter, our balance sheet leaves us uniquely positioned to take advantage of the opportunities that may arise from the larger macroeconomic environment.

Now turning to our outlook, we are presently guiding to the third quarter and full year 2022.

For the third quarter, we expect revenue of between $474 million and $475 million and adjusted operating income of $54 million to $55 million.

For full year 2022, we now expect revenue of between $1.9 billion and $1.902 billion and adjusted operating income of $341 million to $343 million.

This revised guidance excludes any new major U.S. government awards, and we believe this to be the base case.

With that I'll turn it over to Anna to start the Q&A.

Thanks, Dave. Our first question comes from Martin, who asks: when are you expecting to become profitable?

Thank you for your question.

Yes. We are driven by our five products and by our ability to control costs, mainly because we left Palo Alto, which we left primarily for political reasons. But of course Palo Alto is a somewhat inelastic luxury product, meaning you must pay and pay and pay for the comfort of having political views oddly misaligned with American interests, and in general high costs.

That combination of <unk>, a radical optionality on expenses, and what we perceive to be macro conditions converging with product conditions allows us to see what we think will be a profitable company in 2025.

Great. Thanks, Alex. Our next question comes from Jose, who asks: what is your 10-year plan for the future?

We built five products.

Those who follow us know them well. Those who don't follow us know them as well, because they've influenced your life, especially PG, our anti-terror product, and <unk>, which is responsible for any vaccination you had if you are listening to this from America, Britain and some other countries.

Those of you who are following the war know some of our other products, even at a distance, and a number of other products. We are continuing to deploy those products. By the way, as you may have noticed, we've now crossed the $1 billion mark a second time. When we <unk>, there was a real question: could Palantir cross the billion-dollar software mark? This is a massive mark for enterprise software. We've now crossed this in America. We will cross that for all five of our products. Obviously, Foundry will be a multibillion-dollar product.

We will do this in the service of the West, which we will fight to help win, and rejuvenate institutions the way we have in the past on numerous fronts. We will use the distribution network we have built to provide these institutions with new and better products. Apollo will be a multibillion-dollar product as a standalone product. Our company will be, as of the next two years, a <unk> dollar revenue company.

And we will preserve our culture.

And we are going to do this with essentially a very long-term focus. Do not expect us to be like another company. We will make aspirational goals. I think there's a lot we can learn, by the way, on modeling these goals.

But we believe we're the best in the world at software for the West, and with the attributes and work <unk>, we will push forward to make the West stronger and better and build a multibillion-dollar company on top of each one of our single products. By the way, we will build new products. We're already working on a number of products.

We have products like Nexus Peering that are not apparent to the world, that power very important things, and we'll continue to build those kinds of products as well.

Thank you, Alex. Our next question is from John, who asks: what are you doing differently from your competition that is allowing you to be successful?

What we're doing, I mean, we're pretty orthogonal to our competition. And by the way, I would say we're less competitive with our competition than <unk>.

We have a lot of people who compete with us. Most of the companies are kind of built for Wall Street. They have <unk> perfect financials and a perfect <unk> of what they're going to hit, and very, very thin products, with sales forces that are 50% of their total people. The sales in U.S. commercial, which are by any standard anomalously strong, almost 80% growth this year without SPACs, total organic revenue, are being driven by 42 salespeople, less than 1.5% of our company.

We have more, but those are the ones who have actually learned the product and are able to sell it.

And because of some real features we built into this product, the product is obviously ahead of the market. For the growth in the U.S., you can draw a trend line: this is doubling now. Again, we're close without SPACs. You could see how this could double or be in that range again next year. And again, we're doing this in a completely unique way.

And so then I would also say, as compared to the competition, one of the things that is just not paid attention to at all by the normal investing community is the very basic fact: is this a product that will be in some ways replicated by the big companies? They are wonderful, truly interesting, great companies, but they will basically take thin products and get them into their offering and have better distribution, and in fact in the end the products are, over a long period of time, relatively worse.

Let's say you get high growth built on the back of selling equity, taking equity and buying a sales force, and selling a thin product which is no longer going to be around in a couple of years, either because it will not be useful or because a big company will replicate it and sell basically the same product. Our products are not like that. Nobody is really endeavoring to do a version of PG or Nexus Peering, which people don't understand, which is powering defense efforts globally. Foundry is very thick and is essentially hundreds of products, integrated and deployable quickly. You have Apollo. These are very, very unique products.

And so what I would say on the competition thing is that it is surprising how different we are from other people in, broadly speaking, the same space, and that has some advantages.

We see enormous traction on our products.

You're also getting this at a price, because we run this company as owners, and we do not run it purely to make people happy quarter to quarter. I honestly don't pay a lot of attention to that. I'm paying attention to where the company will be in two years, and what the products will be like in deployment: will they be adopted, who's building them, how are they going to be built, and are they in fact ahead of the market? So we have this very unique hybrid. You have to look at it as a unique company and assess it as: what feature set do I like, and what bug set do I not like?

Great. Thanks, Alex.

I'll turn to you, Shyam, for this next one before we open up the call.

Christopher asks: can you explain to <unk> shareholders what becoming the sixth prime for the U.S. government and the first software prime would mean for the company?

Also, what sort of timeline are you targeting to become the first software prime for the U.S. government?

Thanks, Christopher. Our ambition in the U.S. government, and by extension all of government, is to be the sixth prime contractor. That means that we are the trusted partner capable of delivering end-to-end platforms and programs, and that we're willing to invest deeply in the specific needs and requirements of these customers.

But we want to do this as the first software prime, and we already are. The way that we think about our products in government is as offerings that are built with our platforms. They are built with Gotham, with Foundry and Apollo. The products would be, for example, at the U.S. Army: Vantage, Titan, CD-1 and CD-2, massive programs of record. Or Operation Warp Speed's Tiberius, the vaccine management platform that was built with Foundry; DCIPHER at the CDC; Project Brown Heron at the Air Force; <unk> at the Space Force; and, at INDOPACOM, the Mission Partner Environment.

Titan will deliver a next-generation, expeditionary, scalable and maneuverable platform that is purpose-built to address the Army's number one gap in large-scale combat operations: deep sensing. In simpler terms, it is an armored truck that connects to satellite and in-theater assets, and we are offering an end-to-end integrated hardware-software solution here.

This is exactly what we're doing with Skykit, fusing MetaConstellation, Starlink and a purpose-built compute platform to enable warfighters to task satellites from the field and receive low-latency, AI-driven detections to drive kinetic operations. Look no further than the CCP's intimidation exercise in Taiwan, still ongoing. We think being the sixth prime is how we rebuild the arsenal of democracy.

Thanks, Shyam. Our next question comes from Brent at Jefferies. Brent, please turn on your camera and then you'll receive a prompt to unmute your line.

Thank you, good morning. On the government side, I am curious if you can address what you're seeing in some of the deals that you expected to close. Can you give us a sense of timing on the government side? And I have a quick follow-up question for Alex.

You want to take this, and then I'll do the riff?

Sure. Look, the programs that we're going after are enduring programs. The competition is ongoing. The needs that we're servicing here are vital to the future of the West on both fronts of the potential conflicts.

We have revised guidance given the clarity that we have around when we think we're likely to get them.

And so we continue to invest in these things. The pipeline around the government business continues to be really robust, both domestically and internationally, but we also have more certainty around what is obviously a frustrating contracting experience.

Let me give you a different riff on this.

A number that I think we shared on one of our earnings calls was that our U.S. government business has a CAGR over a decade or more of 35%.

During that time, we've had a number of years that were flat, and this is frustrating. Believe me, it's more frustrating for us than anyone else, because we would prefer an even lower CAGR but with more certainty.

Nevertheless, you can ask yourself the question: does it appear that the last 10 years were less dangerous, or are the next 10 years going to be more or less dangerous than the last 10 years?

It is a very basic view that we have: the next 10 years, the next few years, are clearly more dangerous. America's engaged on multiple fronts. And then there's a question: does Palantir have the product-market fit and access to the market for our product? We're looking at the U.S. business; it is going to cross the $1 billion mark next year as well. So you have a $8 billion software business as of next year, with positioning that has never been as good. Both our micro positioning and obviously the macro positioning are so sublime, it's hard to talk about without sounding like we're kind of, we're monitoring.

And that's why I am positive internally and externally.

The growth in U.S. government over a multiyear period will be at least as good in the future as it was in the past. However, that 35% CAGR included a number of years where it was flat or even negative, and that's just the frustrating part about contracting at our level. The contracts are so big, meaning that you have got to kind of wait.

And just a quick follow-up: when you think about the overall macro conditions that seem to be impacting any number of other software companies, can you discuss what you're seeing on the commercial side? Is there any slowdown there, or are you seeing <unk>?

One of the things, and I'm not exactly sure how to put this, is that the typical way in which our company would interact with, say, analysts is that we would have these precise goals, and then we'd go after them, and then when we're not going to get them, we would say things that we could have said on this call, like: almost 40% of our business is outside of America.

Every one of those contracts is being impacted by the dollar. Obviously, it's not just that people are paying us in local currency. If you are paying us in dollars, which many of them are, you're de facto paying 20%, sometimes more, which we can't capture on our balance sheet and can't show to you. Those things are impacting our business. I would say the primary impact to the business, though, is actually positive. The U.S. adapts when things are bad, and adapts very quickly.

I spent a lot of my life in Europe. Europe does not adapt as quickly. And so what we're really seeing is America adapting to what we believed five years ago was the product of the future, and buying it in a very anomalous way and beginning at scale. So we're not talking 50 to 100; we're now talking <unk> 400 to 650, 750.

So we're seeing the doom and gloom that is a blight on society, however one wants to call it, both economic and political, and quite frankly the fact that the legitimacy of our leaders is so embarrassingly low that it's hard to solve problems. We're seeing that negatively impact the business outside of America, particularly in Europe, because people are entrenched and slower on the tech acquisition side, and very, very positively accelerating our business. We are seeing it in commercial and, we believe, in government.

And yes.

Great. Thank you, Alex.

Our next question comes from Sanjit at Morgan Stanley. Sanjit, please turn on your camera and then you'll receive a prompt to unmute your line.

Hi, thank you for taking the questions.

I also wanted to get your view not so much on the quarter, but on the longer-term framework. I noticed that you guys didn't reiterate that 30% outlook, and in some sense that makes sense because deals are uncertain. But I thought it was interesting that you guys dropped that 30%, which made it seem that some of the issues that you're seeing...

Okay.

...in terms of contracts. I'm sorry to...

I hear the question. I was hoping to see you as well, but it's okay, keep going. I have the gist of the question.

Got it.

I think in our ongoing business.

And thirdly, I am driving the company to get to $4.5 billion in 2025.

I believe in driving the company that way. And what I believe is that the future on USG, and by the way AIG, is likely to be at parity with the last decade, both for macro reasons and because we simply have products that we built that were not deployed because they were basically wartime products.

So you have MetaConstellation, Nexus Peering and other products that de facto are not that useful unless you're at war.

Then you have the uses of Foundry in the Defense Department and in other departments that get usage, but not the way they should, for all sorts of issues that are also in civilian. And then you have PG, and the fusing of PG and Foundry was particularly useful, but you have to understand it.

So I believe that we will get to that 2025 goal. I tend to view the business the way I view our most important segment of the business, which is that there will be ups and downs.

Again, the 10-year CAGR on USG is 35%.

Thanks, Alex. Our next question comes from Mariana with Bank of America. Mariana, please turn on your camera and then you'll receive a prompt to unmute your line.

Good morning, everyone.

Yes.

Lynn.

Correct.

No.

My question is a follow-up on that U.S. Army contract environment. You mentioned it is frustrating, and I think most of the defense market feels the same. However, what I'd like to hear from you is whether there is any change from the customer in its relatively traditional approach to data rights, given the urgency today.

Yes.

First of all, again, our frustration is built on the fact that we have a very large integral. So of course we see smaller things accelerating, but to grow off of last year's 60, 70 base up to where we should grow requires very large contracts over many, many years.

By the way, in answer to your question, and this is not exactly what you're asking, part of our strategy is that we are going to grow commercial to be so big and so linear, especially starting in the U.S., that in fact these vicissitudes are just less important.

What you're seeing in our business now is that the largest part of our business is subject to contracting. That is actually going to shift with the U.S. business. I mean, even without doing an FX currency adjustment, which I have been told everybody on the planet is doing and I don't want to do, our European business is still growing at almost 20%. That's without FX adjustment.

So then if you look at our business, which continues to grow in America, again because of this adaptation in America and a willingness to embrace things that didn't make sense two years ago and now make sense, you are looking at a business where the vicissitudes are just less important.

On the general political thing, what can the West do? Of course, I mean, we are just in a situation in the West where we have the most interesting, noble and virtuous societies, and there is a real dearth of operational leadership. We could save money in the U.S. government by going around saying: look, we will contract quicker but get paid less. That would obviously be good for everyone.

That's unlikely to happen, and so what we're doing is building our business so it's less important, and trying to get better at both forecasting how this will happen and getting locked into bigger and bigger and bigger contracts. But this will be an ongoing issue for <unk>.

We believe the severity of it will be less every year, simply because of the strength of the U.S. and, later, the hand-off function in Europe on our commercial sales.

Thanks, Alex. Our next question comes from Brad at Deutsche Bank. Brad, please turn on your camera and then you'll receive a prompt to unmute your line.

Great, and you guys seem on mute, but we're rolling here. You can just...

We will do it without the, with no audio. Now we've got the audio. Excellent.

Thank you so much for taking my question. I wanted to ask about partnerships, which you've spoken about in recent quarters, particularly with government contractors and other providers to USG and global governments. What proof points are there that support your confidence that these traditional providers to government are choosing to build their business with Palantir, rather than maybe coming up with their own reusable IP that qualifies for meeting the requirements of the bar?

Hey, Brian , it's John or dig into the specifics. It is my belief that this space does not have comparable providers of software I do not believe there is a single provider in the world that can produce our software.

But they can produce things on top of our software that we can't produce and I don't think this is a marriage of love I think this is a marriage of necessity honestly if they could build the software we built they would not partner with us if we could build the hardware products, they build or quite frankly navigate some of the both American and international networks that they can add.

<unk> unless we wouldn't partner with them now the specifics charm is very much on top of.

Yeah, absolutely. So we are developing a large pipeline with partners that is built on this integrated hardware and software offering. There are a lot of places where we're the prime and they are partnering with us to deliver on this. That's led to a lot of productive collaboration that's brought opportunities within the <unk> and the U.S. more broadly where they're the prime and our software can be a key differentiator in going faster. There is also a maturity aspect to this where, to Alex's point, they've tried to build the software that we have and have failed, and that takes five years or longer. So many of them are coming up to the point where they realize it would be much better for them, as a matter of necessity, to partner to go faster.

By the way, on this point, one of the most important things driving our <unk> software, but especially in commercial, is that people have tried and tried and tried to build our product. We have a number of customers that we've been able to bring on board this year that quite frankly didn't like us.

But the product brought them back in.

Why did the product bring them back? Because the product is actually delivering value that is otherwise not available. You could imagine, two years ago, you could spend $1 billion instead of spending $20 million at Palantir. A lot of those people have spent $1 billion.

And lo and behold, they're being brought back to this product, often with people that do not inherently want to hang out with us, but the product has brought them back. And the U.S. government, especially, has tried everything not to buy our product. We had to sue the U.S. government twice.

Just imagine how popular I am.

They still are buying the product.

It's not always a love relationship. It's that we bring you back.

Our products bring you back once you've used Foundry, or whatever it is across all the use cases we talk about.

You use Nexus Peering, powering Gaia and Foundry, to bring yourself home alive. You're not eager to not come home alive.

You will buy the product that actually delivers that, even if the person who is nominally or actually in charge is unlikeable to you.

Thanks, Alex and Shyam.

Alex, we've had a lot of individual investors. My question is: is there anything you'd like to say before we end the call?

We have talented individual investors. Really, there are a lot of motivations for fighting to win. Most of them are: I believe the West needs people fighting for it, and I think and know we're uniquely positioned.

I have great reverence for the people at <unk>.

But one of the really big motivations for me personally is individual investors. I have a lot of respect for the time you take. Whenever I read reviews of Palantir, they come from people who actually used the product. By the way, we're going to begin to talk to institutional investors, but the primary thing we're going to ask them to do is use our product.

One of the largest, most important institutional investors, we're engaging with them.

Use our product first, and then we'll discuss whether it's differentiated, the unit economics, the margin numbers, all of these things. But it all flows from: do you believe the product is actually differentiated? Are these products the best in the world? Can Microsoft, Oracle, Amazon, Google replace these products quickly?

Those are the kinds of questions, and the people who spend the most time on those questions are individual investors. I have a lot of respect for that, and you're one of the many reasons that we fight.

Thank you.

Thank you, everybody, for joining us today to go through this walkthrough of the Foundry platform.

There's a lot to cover, so it'll be a bit of a whirlwind tour. I won't be able to get to everything, but I'll try to give you a representative cross-sample of the key parts of the platform.

Foundry behind the scenes is 100-plus different microservices, surfaced up as about two dozen different out-of-the-box, user-facing applications (it's actually a bit more), in addition to applications that you can also build.

These applications stretch across effectively every persona in the modern enterprise. So we'll try to take a tour through some of those key personas and call out what kinds of tools or services are used at every step along the journey.

We can think about the functionality in Foundry in three main parts.

So there is the data and ontology piece. How are we connecting to different data sources, data lakes, and data warehouses, and bringing that data together into a semantic foundation that can then be used by the entire organization? We call this an ontology. So we'll talk about the journey of how we connect into different data sources, refine them, build out data integrations with health checks and security, and build out this ontology as a starting place.

We will then want to bring data science and modeling into that same foundation. So this is the sphere of digital twin and modeling. How do we think about leveraging business logic, data science, and ML, LP models within the same foundation that we've built on the data side, trying to fuse data and models together into a common foundation through this ontology?

And then finally, we'll talk about how the combination of data and models comes together for different types of end users. These are everyday users doing core operational functions in the organization, and this is 70% to 80% of users of Foundry. In most contexts they are quote-unquote nontechnical users trying to make use of the fruits of all that labor towards real operational workflows and different applications that are existential to the business.

So let's start at the top with data and ontology. What I'm going to do is first go into a first application here, which is called Monocle, our data lineage graph, which tells the story of how the data comes together into an ontology. Here you'll see this is a pretty notional example, a slightly simplified supply chain ontology, which is showing the different concepts I'm going to work with in my supply chain demo. So I've put together common concepts like a supplier, an alert, a distribution center, plants, materials, et cetera. Each of these objects has links between them, and we'll go into how each of these objects also has actions associated with it.

But the idea is that if we look at any one of these objects (we're going to go into the plant to start), each of these is the culmination of many different steps of data integration. So if we zoom out here, what we see in this spiderweb diagram is the story of how we integrated different data sources to build this plant object, which sits on the right-hand side.

So reading this from left to right, what we see here is that we're pulling in a couple of different data sources. Typically these diagrams are much, much larger and much more complex, but this is just a starting example. On the left-hand side you'll see us connecting to an SAP <unk> ERP system as an example of one business system; Treasury.gov, which is an example of an open, API-driven source; and a time series source, which is a synthetic sensor that I've set up in my environment to work with.

And the idea is that every one of these steps, every one of these rectangles or nodes you see in the graph, is one step closer to refining, cleaning, joining, and applying business logic to that data to get us to that semantic representation.

So for instance, if I take a look at one of these datasets, I can always pop it up and say, okay, what does this data actually look like? So let's take a look at the PySpark code here that shows us how we're applying logic at this step in the pipeline.

If I select a different node, I can go and take a look at the history tab, and I can see that we have full data versioning for every version of the data that's updated in this particular part of the pipeline. This is huge, because I can go back and take a look and say, okay, who actually updated this particular routine or this particular schedule of this data build?

What was the actual result for this version of the data versus the current version of the data? And I can roll forward and backward through different versions of the data, with the code and the metadata at each point in time. So it's a powerful ability for me to not only look at data pipelines as they exist today, but also to look at how they have changed over time.
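As a rough illustration of the history tab idea described above (every build adds a version, and each version records who produced it and with what code), here is a minimal Python sketch. It is not Foundry's implementation; `VersionedDataset`, `DatasetVersion`, the `code_ref` field, and the sample data are all hypothetical names made up for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class DatasetVersion:
    """One immutable snapshot of a dataset plus who and what produced it."""
    version: int
    author: str
    code_ref: str   # hypothetical pointer to the transform code that ran
    rows: tuple     # the data as of this version


@dataclass
class VersionedDataset:
    """Append-only history: new builds add versions, nothing is overwritten."""
    name: str
    history: List[DatasetVersion] = field(default_factory=list)

    def commit(self, author: str, code_ref: str, rows: list) -> DatasetVersion:
        v = DatasetVersion(len(self.history) + 1, author, code_ref, tuple(rows))
        self.history.append(v)
        return v

    def at(self, version: int) -> DatasetVersion:
        """Look up the data exactly as it was at a given version number."""
        return self.history[version - 1]

    def latest(self) -> DatasetVersion:
        return self.history[-1]


plants = VersionedDataset("plant_clean")
plants.commit("alice", "commit-a1", [{"plant": "P1", "capacity": 100}])
plants.commit("bob", "commit-b2", [{"plant": "P1", "capacity": 120}])

old, new = plants.at(1), plants.latest()
changed = old.rows != new.rows  # compare an earlier version with the current one
```

Because versions are only ever appended, "rolling back" in this sketch is just reading an earlier entry; the author and code reference travel with the data at each point in time.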

There are a few other things I want to make sure that we call out here in this part of the Foundry platform, the build system for data, or the data ops part of things.

The first is that you are seeing these lines connecting all these different nodes together, and this is an indication of lineage. Lineage is actually a concept that is intertwined with security in the Foundry platform. To show how we're securing data and how that security will flow into the ontology and into workflows beyond that, let's change the view here for a second to take a look.

So we're going to change over to the permissions view, and you'll see here that it's now showing me dynamically what my permissions are across this entire pipeline. You'll see it's blue: I have access to everything here, because I'm the administrator of this environment.

But what's actually happening is that we're assigning permissions to the data sources (the SAP system, Treasury.gov, etc.), basically the role-based access permissions that might be coming in from an Active Directory system or some other authorization scheme that our organization has, and we're then propagating those controls forward. That means that every child will inherit the union of all the permissions of its parents. To make that a little bit less esoteric, let's use another example. Suppose I change this permissions view, which I can do as an admin, to my colleague Jason Richardson.

We will see that Jason doesn't have access to most things here, and that's because he didn't have access to anything except for the Treasury.gov source. So, as the permissions flow, he is going to lose access the more we start to blend that source with other sources.

You'll notice he actually has one exceptional piece of access here, where he would not have had access originally, which is on this OFAC clean dataset, because we chose to situationally expand his permissions in this case, but not allow that to propagate forward to any other pieces of the pipeline.
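The propagation rule described above (requirements are set at the sources, every child inherits the union of its parents' requirements, and a one-off exception applies to a single dataset without flowing downstream) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Foundry implements it: the dataset names, the requirement labels, and the `required` and `can_access` helpers are all hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical lineage: each dataset lists its parent datasets.
PARENTS: Dict[str, List[str]] = {
    "sap_raw": [],
    "treasury_raw": [],
    "sensor_raw": [],
    "treasury_clean": ["treasury_raw"],
    "supplier_joined": ["sap_raw", "treasury_clean"],
    "plant": ["supplier_joined", "sensor_raw"],
}

# Access requirements are assigned only at the sources.
SOURCE_REQUIREMENTS: Dict[str, Set[str]] = {
    "sap_raw": {"erp"},
    "treasury_raw": {"public"},
    "sensor_raw": {"iot"},
}

# Per-dataset exceptions: granted on one node only, never propagated.
EXCEPTIONS: Dict[str, Set[str]] = {"supplier_joined": {"jason"}}


def required(dataset: str) -> Set[str]:
    """A child requires the union of everything its parents require."""
    reqs = set(SOURCE_REQUIREMENTS.get(dataset, set()))
    for parent in PARENTS[dataset]:
        reqs |= required(parent)
    return reqs


def can_access(user: str, grants: Set[str], dataset: str) -> bool:
    if user in EXCEPTIONS.get(dataset, set()):
        return True  # situational expansion for this dataset only
    return required(dataset) <= grants


jason_grants = {"public"}
visible = sorted(d for d in PARENTS if can_access("jason", jason_grants, d))
```

In this toy graph the user keeps access along the Treasury-only branch, gains the single excepted dataset, and still cannot see `plant`, because the exception is checked per dataset rather than inherited.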

The point here is that we can mix and match very different types of security models to get very granular, robust results. So we can intermingle role-based access controls, classification-based access controls, and purpose-based access controls all here, so we can have very fine-grained control over who is doing what, across different personas and different parts of the organization, in Foundry.

This is very much built from our own experience in the field as Palantir engineers, where if you thought about just doing security based off of the location of data in some final system or some bucket location, it became really hard to make sure that security was robust as the organization organically started to pull that data into other places.

So data and security flow together with lineage, and it's immutable as you work throughout the rest of the platform or interact through the APIs. So that's the first thing to call out here: security is baked right in as part of the metadata in Foundry.

The second piece I want to call out here as we work through things is: okay, so I build out these pipelines, they get to be quite robust, they're refreshing all the time. How do I keep these things healthy?

So if I change this now to a data health view, what we're going to see here is a view into the parts of the pipeline that I've instrumented with data quality checks, and I can always pop this open at the bottom to say, let's take a look at the checks that we have across this part of the pipeline. So I'm going to take a look here at one of these nodes, and we can say, let's add a new health check here.

This is going to pull up a list of out-of-the-box health checks that we can assign to different points in our pipelines. Is this data updating on time? Is it refreshing on the schedule that I expected? Are the files the size that I expect? Are we seeing the cardinality of different data columns with the statistical results that we expect? All of these things can be configured here through point and click, but can also be set through code as well. We'll go in a second into how we actually author these transformations, but the data health component is key to keeping these pipelines healthy as they're refreshing all the time.

And so having these checks right here saves you from having to go to a different platform to see what's going on there before you can come back to this platform here.
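The three kinds of checks mentioned above (is the data fresh, is it the expected size, does a column have the expected cardinality) can be expressed in code roughly as follows. This is a minimal Python sketch under stated assumptions: `DatasetStats`, the check functions, and all thresholds are hypothetical and are not Foundry's health-check API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class DatasetStats:
    """Hypothetical summary captured after each build of a dataset."""
    last_updated: datetime
    size_bytes: int
    distinct_plant_ids: int


def check_freshness(stats: DatasetStats, now: datetime, max_age: timedelta) -> bool:
    """Pass if the dataset was refreshed within the allowed window."""
    return now - stats.last_updated <= max_age


def check_size(stats: DatasetStats, min_bytes: int, max_bytes: int) -> bool:
    """Pass if the output size falls inside the expected range."""
    return min_bytes <= stats.size_bytes <= max_bytes


def check_cardinality(stats: DatasetStats, expected: int, tolerance: int) -> bool:
    """Pass if the distinct count of a key column is near the expected value."""
    return abs(stats.distinct_plant_ids - expected) <= tolerance


def run_checks(stats: DatasetStats, now: datetime) -> List[str]:
    """Return the names of the checks that failed, in a fixed order."""
    results = {
        "freshness": check_freshness(stats, now, timedelta(hours=6)),
        "size": check_size(stats, 1_000_000, 100_000_000),
        "cardinality": check_cardinality(stats, expected=40, tolerance=5),
    }
    return [name for name, ok in results.items() if not ok]


now = datetime(2022, 8, 8, 12, 0)
stats = DatasetStats(
    last_updated=datetime(2022, 8, 8, 3, 0),  # nine hours before `now`
    size_bytes=5_000_000,
    distinct_plant_ids=42,
)
failures = run_checks(stats, now)
```

Here only the freshness check fails, since the last update is older than the six-hour window while size and cardinality are within bounds.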

So let's now switch back to our default view, and take a look in more detail at what's actually going on in these individual steps. I'm going to open up one of these steps of logic here, and we're going to go in and take a look at the repository and how this particular result was built.

So this is going to launch us into another application in Foundry called Authoring, which is an integrated development environment, where I can see how I can make changes to this part of the pipeline with full change management built in. The key thing here is that you'll see, like we saw in Monocle, the previous application, this branch icon here at the top. What this allows us to do is treat data the way that we treat code. So instead of making changes on the master branch and then trying to propagate my changes through, which can cause all sorts of downstream issues if I'm not careful, what we've actually done is virtualize the entire change management process that would normally involve going through a <unk> environment, but have done it directly here in one instance of Foundry using logical isolation and a cloud-native paradigm. So what I can do if I want to make changes here is open up a branched version of this part of the pipeline.

And then when I've made my changes on that branch, I would create a pull request to merge my changes back into the master branch. If we go into some of the old pull requests to take a look at what these proposed changes look like, we can see line by line what those changes were, and we can also get an impact analysis: if we were to merge this into master, what would be affected? And because security again gets blended through different parent permissions in different role paradigms, what security changes would actually occur as well? You'll see we didn't actually see any in this particular case.

This is a huge difference from the normal process of saying, okay, if I wanted to make changes, I'd have to clone all my pipelines and data over to a <unk> environment, a separate environment, make some changes, then say a prayer, put them back to master, and hope that everything just works. Here, we can do that entire process with total logical isolation, total sandboxing. We can iterate on these branched versions of the data and use different analytics tools and visualization tools with that branched data, which we'll look at later.

And then we can choose to come back and merge that into master. The most powerful thing about this is that everything is Git-based in terms of semantics. You'll see here the clone icon at the top, which means if we do want to go back to a prior version of master or any of the other branches, we have a full commit log as well. So we're never out of luck if we've merged something in that we realize later has to be undone. We can always just move back, like we can in Git with code, to a previous form of the data.
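To make the "treat data like code" workflow above concrete (branch, change in isolation, merge back, undo via the commit log), here is a deliberately simplified Python sketch. It is a toy model, not Foundry's or Git's implementation: `DataRepo` and its methods are hypothetical, the merge is a plain "target takes the source's head" rather than a real three-way merge, and a snapshot is just a dictionary.

```python
from typing import Dict, List


class DataRepo:
    """Toy Git-like store: each branch is a log of dataset snapshots."""

    def __init__(self, initial: dict) -> None:
        self.branches: Dict[str, List[dict]] = {"master": [dict(initial)]}

    def branch(self, name: str, source: str = "master") -> None:
        # Copy the commit log; existing snapshots are never mutated.
        self.branches[name] = list(self.branches[source])

    def commit(self, branch: str, snapshot: dict) -> None:
        self.branches[branch].append(dict(snapshot))

    def head(self, branch: str) -> dict:
        return self.branches[branch][-1]

    def merge(self, source: str, target: str = "master") -> None:
        """Simplified merge: the target branch adopts the source's head."""
        self.branches[target].append(self.head(source))

    def revert(self, branch: str, steps: int = 1) -> None:
        """Undo by re-committing an earlier snapshot; history is preserved."""
        log = self.branches[branch]
        log.append(log[-1 - steps])


repo = DataRepo({"P1": 100})
repo.branch("fix-capacity")
repo.commit("fix-capacity", {"P1": 120})

master_during_work = repo.head("master")  # master is untouched by branch work
repo.merge("fix-capacity")
after_merge = repo.head("master")
repo.revert("master")                     # realize the merge must be undone
after_revert = repo.head("master")
```

The design choice worth noting is that a revert appends to the log instead of deleting from it, so the record of what was merged and then undone remains inspectable.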

And so being able to treat data like code, having the health checks, and having integrated security, we think, gives you the build system you need to go really fast when it comes to assembling and growing these data pipelines organically, which is always the case, right? There is never a waterfall method of building these things upfront that never changes. It's always a constant state of iteration and trying to keep up with the latest demands.

I want to show you guys one more thing before we go into the ontology and bump things up a level.

We've talked a lot about being able to author data pipelines here. The way that you can author pipelines in Foundry is that, by default, we offer two out-of-the-box compute engines that are auto-scaling. This is Flink running on Kubernetes for streaming computation and Spark running on Kubernetes for batch computation, so you can basically use any languages that have bindings to those two.

And so if we go back to a repository here, we will see where we're using PySpark. We can use Spark SQL, we can use Java, we can use Scala, and we even have bindings for Groovy.

So we adhere to open standards all the way through when it comes to code authoring and, critically, when it comes to data formats as well. If you ever want to look at what's happening with the resulting data at any step of any transformation in Foundry, you can look at the preview or go to the history tab, and you can see that any of the data here is effectively just data that's in Parquet or in <unk> or in any open format that I expect, so I can always interoperate easily with any of the data here in Foundry.

And critically, even though we're using Spark and Flink as our runtimes, if we take a peek at what's actually happening under the hood with Foundry's build service, we'll see it's just calling those two runtimes, because we have registered them with Foundry. So if you have a high-performance computing environment, or you have an external compute engine, like another Spark environment or a proprietary runtime, we can register that runtime with Foundry as an external build worker, and when we refresh these pipelines, the orchestration in Foundry will know at certain steps to go and federate that computation to that build worker.

This allows you, again, to have a totally extensible compute framework, with what we think are great starting defaults, but you're never locked in to just using the compute that we provide as part of <unk>.
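The registration idea described above (a build service that only knows runtimes by name and hands each pipeline step to whichever worker that step is assigned to) can be sketched as a small registry. This is an illustrative Python sketch under stated assumptions, not Foundry's build service: `Worker`, `BuildService`, the worker names, and the in-process "execution" are all hypothetical stand-ins for real remote engines.

```python
from typing import Callable, Dict, List, Tuple

Transform = Callable[[List[float]], List[float]]


class Worker:
    """Stand-in for a compute engine; a real one would run jobs remotely."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.jobs_run = 0

    def execute(self, transform: Transform, data: List[float]) -> List[float]:
        self.jobs_run += 1
        return transform(data)


class BuildService:
    """Orchestrator that delegates each step to a registered worker."""

    def __init__(self) -> None:
        self.workers: Dict[str, Worker] = {}

    def register(self, worker: Worker) -> None:
        self.workers[worker.name] = worker

    def build(self, steps: List[Tuple[str, Transform]], data: List[float]) -> List[float]:
        for worker_name, transform in steps:
            # Federate this step to whichever worker it is assigned to.
            data = self.workers[worker_name].execute(transform, data)
        return data


service = BuildService()
batch = Worker("default-batch")      # hypothetical built-in engine
external = Worker("external-hpc")    # hypothetical customer-provided engine
service.register(batch)
service.register(external)

pipeline = [
    ("default-batch", lambda xs: [x * 2 for x in xs]),
    ("external-hpc", lambda xs: [x + 1 for x in xs]),
]
result = service.build(pipeline, [1, 2, 3])
```

Because the orchestrator depends only on the `execute` interface, adding another engine in this sketch means registering one more `Worker`, with no change to the pipeline logic.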

Alright.

So one more thing to show you guys, and then we'll bump it up a level. We talked about being able to compute these pipelines and author them manually. What about data systems that we've seen over and over again, like the SAPs and other ERPs and the CRMs of the world?

So what we've tried to do there is say, can we actually implement what we think of as software-defined data integrations, where you don't have to manually redo the kind of cleaning and transforming work that we think is actually quite standardized and possible through ML-assisted techniques?

So what I'm going to do is open up the data source for this SAP instance we're connected to internally, and we're going to take a look at how this looks. This is an SAP explorer that we have as a rich connection interface, and you'll see it's showing me all the modules in my ERP estate, like my data dictionaries, and it's even pre-populating some workflows I might typically want to do that are common for SAP.

And the power here is that instead of having to think about browsing through all the individual tables myself, we have a native connector connecting at the application layer, which is called NetWeaver, and we're dynamically pulling out all the rich metadata that tells us much more than just the raw table information in SAP. We're able to see all the German acronyms auto-translated for me, and we're able to see foreign keys and relationships. The idea here is that we can derive out that federated lineage, and then auto-build pipelines and even starting objects in a matter of hours, as opposed to weeks or months of manually recreating SAP transformations on the Foundry side.

And this applies critically not just to the standard modules like MM or SD or DD, but also to Z tables that you might have customized, because we're not just doing static mapping; we're doing a dynamic inference of the metadata through NetWeaver. So if you have a very custom SAP estate or CRM estate, and many people do, it's very simple to install the SAP connector, which is certified native, and be able to browse around and then, in a shopping cart experience, pull the data that is of interest to you into a set of pipelines in Foundry that you then start to work with. That was actually how we created these green pipelines here that we saw before, and we saw that what was actually produced was PySpark. So even if it gives us a starting set of pipelines, we can always choose to expand that. We're never given a black box output that we can't change. So that's an example for more common systems, the CRMs, the ERPs, and the others of the world.

We're trying to invest in speeding time to value for our customers, to say we don't want you to have to redo all those integrations if it's not required.
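One way to picture the metadata-driven approach described above is that foreign-key information, once pulled from the source system, is enough to propose join steps automatically instead of hand-writing them. The Python sketch below illustrates only that idea and makes several assumptions: the `METADATA` dictionary is a hand-written stand-in for what a connector might return, the table and column names are used purely as illustrative examples, and `plan_joins` is a hypothetical helper, not part of any SAP or Foundry API.

```python
from typing import Dict, List

# Hypothetical metadata: per table, a readable label and its foreign keys
# as (column, referenced_table) pairs. Values here are illustrative only.
METADATA: Dict[str, dict] = {
    "MARA": {"label": "General Material Data", "foreign_keys": []},
    "T001W": {"label": "Plants/Branches", "foreign_keys": []},
    "MARC": {
        "label": "Plant Data for Material",
        "foreign_keys": [("MATNR", "MARA"), ("WERKS", "T001W")],
    },
}


def plan_joins(root: str, metadata: Dict[str, dict]) -> List[str]:
    """Walk foreign keys from a root table and emit one join step per link."""
    steps: List[str] = []
    seen = {root}
    stack = [root]
    while stack:
        table = stack.pop()
        for column, ref in metadata[table]["foreign_keys"]:
            steps.append(f"JOIN {ref} ON {table}.{column} = {ref}.{column}")
            if ref not in seen:  # avoid revisiting tables in cyclic schemas
                seen.add(ref)
                stack.append(ref)
    return steps


plan = plan_joins("MARC", METADATA)
labels = {name: meta["label"] for name, meta in METADATA.items()}
```

The generated plan is plain text that a person could read and edit, which mirrors the point made above that the auto-built starting pipelines are not a black box.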

Alright.

So we've talked a lot about this first mile of getting data into the ontology. Now let's talk about the ontology itself in a little bit more depth. I'm going to go into one of these objects and pop the hood on what's actually happening here. To do that, I'm going to jump into our ontology management application, and this is a view where a data governance team member, or a part of the business working with IT, might start to get involved. They might say, okay, we've done all the integration work, now let's decide how we're going to make this appear to end users. So, for this plant object we've been working with: is it a normally visible object? Is it hidden? Is it prominent? Is it something we're experimenting on, or is it actually an active object that we should be using daily for different workflows?

Literally, how are we mapping every element from those golden datasets we produced into object form? So things like geospatial properties: how are those coming in? What are the API names for these different properties, and the actual IDs as well? Because we're going to want to use this with all of our different tools in lots of different architectures.

And you'll see we have all this information rolled up here, including the link types and usage information. On the sidebar here, we even have ways to go in and say I want to apply additional security or restricted views; I can do that here as well. I can also prime these objects for different types of workflows. So it might be the case that this plant is going to be used primarily in simulation-based workflows or geospatial workflows, and what we can do here is preset some metadata, which we can always manually set ourselves, just to get us, again, to time to value faster. Let's get these objects up and running so we can start to work with them.

And once you do a little bit of configuration work here, which again can be <unk> to a large degree, business users can dive in and immediately start to make use of this data in a host of different out-of-the-box applications in Foundry.
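The configuration described above (which backing column becomes the object ID, which columns map to which API-named properties, and whether the object type is visible, experimental, or active) can be pictured as a small declarative mapping. The following Python sketch is only an illustration: `ObjectType`, `PropertyMapping`, the visibility and status strings, and the column names are hypothetical and do not reflect Foundry's actual ontology schema.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PropertyMapping:
    column: str            # column in the backing dataset
    api_name: str          # name exposed to applications and APIs
    kind: str = "string"   # e.g. "string", "number", "geo"


@dataclass(frozen=True)
class ObjectType:
    api_name: str
    primary_key: str       # backing column used as the object ID
    visibility: str        # e.g. "normal", "hidden", "prominent"
    status: str            # e.g. "experimental", "active"
    properties: Tuple[PropertyMapping, ...]

    def to_object(self, row: dict) -> dict:
        """Turn one dataset row into an object keyed by API names."""
        obj = {"__type": self.api_name, "__id": row[self.primary_key]}
        for prop in self.properties:
            obj[prop.api_name] = row[prop.column]
        return obj


PLANT = ObjectType(
    api_name="plant",
    primary_key="plant_id",
    visibility="prominent",
    status="active",
    properties=(
        PropertyMapping("plant_name", "name"),
        PropertyMapping("lat", "latitude", "geo"),
        PropertyMapping("lon", "longitude", "geo"),
    ),
)

row = {"plant_id": "P1", "plant_name": "Lyon", "lat": 45.76, "lon": 4.84}
plant_object = PLANT.to_object(row)
```

Keeping the mapping as data rather than code is what lets downstream tools address the same object by stable API names regardless of how the backing columns are named.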

So let's take a look at some of those applications that come to life for a business user, somebody who is nontechnical, who probably isn't starting their journey in Foundry in the data lineage application, the coding environment, or the ontology management interface. They're probably starting in an application like Object Explorer, which is a point-and-click way to browse around different information in the ontology. It should feel very familiar to folks who are searching for information or using different web applications day in and day out.

So you'll see here I have a lot of different objects that are surfaced to me in Object Explorer. I've got my aircraft, my delays, and my flights for my aviation demo. I've got some marketing objects here: KOL interactions, campaigns, offers, things like that. And to search for something is as simple as searching for a type of object or any individual object property. In this case, I want to look at those plants we saw before, and what this will do is launch me into an investigation or exploration that's been pre-configured for me to start to explore this data.

So this is a very simple set of visualizations here, showing where the plants are geospatially, based on the data that we have. It's showing me some rolled-up statistics like 30 days of inventory, production capacity, things like that, and if I want to go into a 360 view of one of these plants, I have a list here to start to dive in and take a look at some of this information at a higher level. The idea here is that it's intended to feel familiar, again, to folks who have worked with business intelligence or visualization software.

And it just rolls up all that information that's coming from all those different data sources in a single view. So I can see, again, the geospatial information, its address, unstructured data like a schematic, maybe some risk factors or material information coming in from the ERP or MES systems, and even things like timelines as well.

These 360 views can look very custom and very different depending on the user type or the business unit. You might imagine that there are certain users that want to see IoT information or video information embedded here, and others that want to see financial information or more CFO-style views.

So you can have different lenses or different views assembled on the same types of objects here, and the power of Object Explorer is that it allows people to traverse the different objects in a completely security-compliant way, and at a scale at which you could have millions or billions of different objects, in a way that makes sense and isn't overwhelming, too granular, or unfamiliar for folks who are used to more business-centric tooling.

So this is just one view into the data, though, and I think it's important to realize that. That's point and click, based off of the objects in this paradigm. But for somebody who is a more technical user, like an engineer or an analyst who is used to working with time series data, I might use a sibling application of Object Explorer, which is called Quiver.

And Quiver here allows me to look at charting information on time-dependent data. So this is going to be streaming data coming in in real time, in an oil and gas context or in a utilities context. As a user of an application like this, I'm probably thinking in terms of not just point and click but formulas and integrals and interpolations and scatter plots, and I can derive new data here just through point and click. So I can say, give me the summation of these two series and produce another series.

Let me look at the ongoing differential between these two pressures and produce another pressure. I can even do things that are more sophisticated, like: let me do a search across these series for any time the difference in these values went beyond a certain threshold. Or let me do a point-and-click training of a first-approximation machine learning model. So I want to build a classifier, and maybe I tinker with that here as somebody who is an engineer, and then I want to collaborate with somebody who is a data scientist on my team. So I can export my first approximation to Python code, with full library semantics for interacting with this time series data in Python as well.
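The derived-series operations described above (sum two series, take the running differential between two pressures, and search for any time the difference goes beyond a threshold) look roughly like this in plain Python. This sketch is not the exported Quiver code or any Foundry library; the `Series` type, the helper functions, and the sample readings are hypothetical, and it assumes both series share the same timestamps.

```python
from typing import List, Tuple

# A series is a list of (timestamp, value) pairs. This sketch assumes the
# two input series are already aligned on identical timestamps.
Series = List[Tuple[int, float]]


def add(a: Series, b: Series) -> Series:
    """Point-wise sum of two aligned series, producing a new series."""
    return [(t, x + y) for (t, x), (_, y) in zip(a, b)]


def diff(a: Series, b: Series) -> Series:
    """Point-wise difference (a minus b) of two aligned series."""
    return [(t, x - y) for (t, x), (_, y) in zip(a, b)]


def exceedances(series: Series, threshold: float) -> List[int]:
    """Timestamps at which the absolute value goes beyond the threshold."""
    return [t for t, v in series if abs(v) > threshold]


inlet_pressure: Series = [(0, 10.0), (1, 12.0), (2, 15.0), (3, 11.0)]
outlet_pressure: Series = [(0, 9.0), (1, 9.5), (2, 9.0), (3, 10.5)]

total = add(inlet_pressure, outlet_pressure)
delta = diff(inlet_pressure, outlet_pressure)
alerts = exceedances(delta, threshold=2.0)
```

Each operation returns a new series rather than modifying its inputs, which matches the idea above of deriving additional series that can themselves be charted or searched.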

And again, the power here is that it looks pretty technical, it looks pretty custom, but at the end of the day I'm just interacting with those objects in my ontology like we saw before, so I can just browse for more objects and plop them onto these plots to use them.

So that's another view into that ontology that we built, one that can be surfaced to another type of user.

If I think geospatially, let's say, not just in an IoT-centric or object-centric way, I might want to use an application like the Foundry map, where I'm looking at where all my plants and distribution centers are, and I can toggle layers on and off as I expect.

I might want to say, for this particular distribution center, show me in a graph-oriented manner all the customers connected to the distribution center, and it's going to visualize that here and allow me to explore the different flows or supplier logistics information that might pertain to the dependencies between different objects.

And so this is another common view, again looking at the same ontology just through a different lens.

This hopefully gives you an idea: if you do that work of rapidly integrating data from different sources, data lakes or data warehouses, layering in those health checks and having that security built in, it allows you to quickly assemble a semantic model, which then opens the door to immediate access and exploration for different types of users, whether it's in Object Explorer, in Quiver, or in a map-based application.

And that really is the journey, in a very, very short form, of the first mile of the platform, which is data and ontology.

If we now think about how digital twins and modeling and data science come into this, it gets really exciting. The first thing I'll say here before we go into the data science part of Foundry is this: if you talk to four or five data scientists inside of <unk> and ask them what tools they love to use, you'll get 10 different answers. People love to use Foundry's built-in machine learning capabilities and authoring capabilities, but they also love to use DataRobot, Dataiku, SageMaker, Azure ML, you name it. So we've tried to build this part of the stack with the philosophy in mind of build anywhere, and then when you are ready to deploy into operational applications, let's give you a streamlined way of doing that.

And so as an example of how I might build a model end to end using Foundry's existing data science tooling, I'm going to switch personas again. We have talked about the data engineering persona and some of the business personas. Now let's think about the data science persona, where I am now in an environment that again looks graph-like or lineage-like, but is intended to feel very familiar to those who have used other data science tooling. So if I pop open one of the steps in my model here and zoom out, you'll see concepts like training sets and feature creation, all the way back to the objects I'm actually pulling data from. Every one of these nodes is just a step of PySpark-based transformation. I can be using pandas, intermixing <unk> or Spark SQL.

And the joke that we have internally is that the most interesting thing off the bat to data scientists is the import data button: being able to say, hey, you can get clean, consistent access to that data that you've built into the ontology or anywhere else in Foundry. As the data scientist, instead of spending 80% of my time rebuilding these data pipelines myself, I can just tap into the good work of my data ops colleagues or my data engineering colleagues. And if anything goes wrong with those data pipelines, like the ones we saw before, I can get immediately alerted to that, because the health information will flow down to me as well. So I have a stable, rich, healthy foundation to build on, and I can then use all the packages and all the techniques that I am familiar with to build out different models end to end, with batch or live inference endpoints, and then plug them directly into different operational workflows.

And of course I can make use of all the packages I would expect. We can mirror things like conda-forge, or you can have your own package managers.

You have the added benefit that you can set up custom profiles that not only have different compute profiles associated with them, but also, if you have particularly sensitive or license-restricted libraries that you want only certain users to be able to use, we can enforce that through the same security perimeter that enforces the data security and ontology security we saw before.

So it allows people to make secure use of commercial libraries as well.

So the idea is: build here, since we think it's a great, streamlined set of tooling, or connect directly from the DataRobots, Dataikus, and SageMakers of the world to the ontology, through open APIs or through Python clients that we provide, so you can build your models in those environments as well.

You can connect via REST, via JDBC, or via direct file access, and we have a Foundry version of <unk>.

It's super streamlined to get data in and out of Foundry, and not just data but also lineage information, the semantic representation, and all the version information. All of that is available through the APIs. So the idea is that you've built your models, some here and some in other tools, and now you want to be able to register them and use them in a variety of end-user applications.

What I'm going to do now is open up my model objectives library. This is mission control for all the models I've been working with in this particular instance of Foundry that I have access to.

A lot of this paradigm was informed by some of our work in the defense space, where there is a three-part approach to deploying operational AI. First, I want to define the modeling problem: not think about a particular model, but ask what objective I'm actually trying to run with. Then I'm probably testing dozens of models, if not more, against that objective. And then I'm figuring out how I want to deploy the winning <unk> into production.
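The three-part approach described above (define the objective, test candidate models against it, release the winner) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the `Objective` and `Candidate` classes, the `mae` metric, and all model names and scores are hypothetical and are not Foundry's actual API or real results.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """A model submitted against an objective (hypothetical structure)."""
    name: str
    built_in: str              # where the model lives, e.g. "foundry" or "aws"
    metrics: dict              # evaluation results, e.g. {"mae": 12.5}
    status: str = "proposed"   # proposed | rejected | released
    note: str = ""


@dataclass
class Objective:
    """Step 1: define the problem, independent of any particular model."""
    name: str
    metric: str                # metric used to rank candidates
    lower_is_better: bool = True
    candidates: list = field(default_factory=list)

    def submit(self, candidate):
        """Step 2: register a candidate, wherever it was built."""
        self.candidates.append(candidate)

    def reject(self, name, reason):
        """Keep rejected candidates on record together with the reason."""
        for c in self.candidates:
            if c.name == name:
                c.status, c.note = "rejected", reason

    def release_winner(self):
        """Step 3: pick the best non-rejected candidate for production."""
        eligible = [c for c in self.candidates if c.status != "rejected"]
        if not eligible:
            raise ValueError("no eligible candidates")
        pick = min if self.lower_is_better else max
        winner = pick(eligible, key=lambda c: c.metrics[self.metric])
        winner.status = "released"
        return winner


# Made-up candidates and scores, purely for illustration.
objective = Objective(name="customer-demand-forecast", metric="mae")
objective.submit(Candidate("random-forest", "foundry", {"mae": 14.2}))
objective.submit(Candidate("gradient-boost", "aws", {"mae": 11.8}))
objective.submit(Candidate("linear-baseline", "azure", {"mae": 9.9}))
objective.reject("linear-baseline", "trained on a leaked target column")
winner = objective.release_winner()
print(winner.name, winner.built_in)
```

Note that the rejected candidate stays in the list with its reason attached, which mirrors the idea of keeping approval metadata around rather than deleting losing models.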

So I'm going to open up a few of the objectives we have here, so we can take a look at what's actually going on.

The critical thing here is that this does not necessarily mean deployed in Foundry. Several of these models are living in other environments, in Azure and AWS, and in managed services like DataRobot.

We're just registering these models here, and we're going to have a holistic approach to comparing them as we think about which ones are going to be deployed into production.

If I open up my customer demand forecast objective, you'll see that at the top level it shows me all of this 360 information about the model: what we are trying to do with it, what the approach is, broadly defined, some information about the latest releases, the relevant files that were used to train this particular model and that were commonly used by every model we tested, and even lineage information about the applications that are going to depend on this model.

Also, as we will see a little later, there is information about how we're chaining this model with other models inside simulation environments.

All of that is rolled up here for me.

In the sidebar, for instance, I can take a look at all the models that we've proposed to deploy against this objective. Right now I see some from my colleague Matt and some from my colleague Andrew, and I can always look back in time.

For this last model, for example, you can see the details about where it was built, when it was built, and, if it was rejected, why that was. So there is a full lifecycle around the approval process. Like we saw with data, where we're treating data like code, we're also treating models like code, with the same kind of metadata around the approval process.

If I switch over to another objective, which might be my campaign model, I can go into the comparisons tab and look at the different models we're assessing. We rejected one of them, but let's still add it into the comparison. We have a random forest model, an SVM model, and a linear regression model, and I want to compare the key criteria across these models: things like the ROC curves, and how the different core dimensions of evaluation perform across the different models.

You can very easily customize what these core dimensions are, and you can even segment out things that are sensitive. There might be personally identifiable information or things that have to be locked down, and you can cordon off that kind of feature comparison or dimensionality comparison inside of a different set of <unk>, parts of the objective that are only open to certain types of people or to folks with certain permissions.

All of this information is here, and the idea is that once you've made your selection and tagged the model as being ready for production, the key step is that we're then going to bind this model to the ontology. This is key because it is basically giving us a type system for the models.

What you see at the bottom here, with the modeling objective API, is that the winning model, which in this case is a SageMaker model, is going to take as two of its inputs two properties from the ontology, from the supply chain ontology we've been working with this entire time. It's going to take the cost and the price into this live inference endpoint, and every time it returns a value from that endpoint, we're going to take those three values, demand, capacity, and inventory, and map them back onto the objects in our ontology as time-dependent properties, or in this case model-dependent properties that can be time-dependent as well.

So the idea is that we have configured the ontology to be aware of the fact that it's going to be receiving values from our model.
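The binding just described acts like a type signature: named ontology properties go in, and named outputs are written back as time-stamped properties. The sketch below shows one way to picture that in plain Python. Everything in it is an assumption made for illustration: `OntologyObject`, `ModelBinding`, and `toy_endpoint` are hypothetical names, and the formulas inside the toy endpoint are invented stand-ins for a real inference endpoint.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable


@dataclass
class OntologyObject:
    """A single object with plain properties and time-dependent series."""
    object_type: str
    properties: dict
    # Each time-dependent property keeps every (timestamp, value) pair.
    series: dict = field(default_factory=dict)


@dataclass
class ModelBinding:
    """Declares which properties feed the model and which it writes back."""
    model: Callable
    input_properties: tuple
    output_properties: tuple

    def apply(self, obj, at):
        # "Type check": the object must carry every property the model expects.
        missing = [p for p in self.input_properties if p not in obj.properties]
        if missing:
            raise KeyError(f"{obj.object_type} is missing inputs: {missing}")
        inputs = {p: obj.properties[p] for p in self.input_properties}
        outputs = self.model(**inputs)
        # Map each declared output back onto the object as a timestamped value.
        for name in self.output_properties:
            obj.series.setdefault(name, []).append((at, outputs[name]))
        return outputs


def toy_endpoint(cost, price):
    """Stand-in for a live inference endpoint; the formulas are made up."""
    margin = price - cost
    return {"demand": 1000 - 2 * price, "capacity": 800, "inventory": 10 * margin}


binding = ModelBinding(
    model=toy_endpoint,
    input_properties=("cost", "price"),
    output_properties=("demand", "capacity", "inventory"),
)
product = OntologyObject("Product", {"cost": 60, "price": 100})
now = datetime(2022, 8, 8, tzinfo=timezone.utc)
result = binding.apply(product, at=now)
print(result)
```

Because applications read the written-back properties rather than calling the model directly, swapping in a different winning model only requires that it honor the same inputs and outputs.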

This is huge, because with this binding in place I can now take this model, which has been competed, has won, and is now bound to the ontology, and through that ontology binding insert it into not just one application or five applications but dozens of different applications, and have it appear seamlessly to end users.

As an example of how that comes together, this takes us to our final mile, which is operational applications.

Let's open up the supply chain control tower example and take a look at how the data and the models weave together for an end-user application.

What we're going to open up here is just an example of an application that, again, might be a starting point for many users in the enterprise.

This is my top-level control tower. It's showing me some summary statistics at the top.

I see that we've got different objects here on a map. As an operational user, I probably don't care about what an ontology is or all the things we've been through. I just know that I have a job to do. I see we have a plant here in Michigan that I need to go take a look at. I see that production capacity is starting to dip and I want to know why. Maybe I received an alert about this.

That alert could have come in via email, via text, or via any sort of integration with a messaging system. I can go into my alerts tab now and take a look at a prioritized, triaged set of alerts that I can start to action. You'll see there are all sorts of things coming in, and maybe I just want to look at those that are high priority, so I'm going to filter here on the left-hand side.

And I'm going to take a look at one of these alerts here, which is a business interruption related to COVID-19.

You'll see that this is pulling together lots of different information from everything we've covered so far. This alert was generated, maybe, by a model pipeline.

It's pulling in data from the supplier and the plants, which are different links in my ontology.

It's showing me the impacted plant this pertains to, and it's actually showing me recommended actions as well, which in this case are coming from a model that I've built.

The critical thing here is that we're not just stopping at looking at this alert and saying, okay, it looks like we have a problem. We're saying we want to be able to take action now, to actually do something in response.

You'll see we have different actions that we've pre-assembled for these users: canceling an order, partially fulfilling an order, or going into more advanced actions like reallocation, which we'll look at in just a second.

If I hit this cancel customer order button, you'll see that it gives me a preconfigured action I can take on this particular alert or a related alert, and I can then choose to submit it in a way that will be non-destructively captured back into the ontology. You might be asking: how does this work? I don't want any user to be able to take any action on any part of the ontology and start to update things. That would be chaos. We cannot have people modifying values willy-nilly.

It turns out that actions are actually part of the ontology at the core level. If we go back to the ontology configuration interface that we saw before, when we were talking about bringing our plant together and all of its connections, and go to the main page, we'll see there are object types, of course, that we're defining, and there are link types, which we've seen throughout.

But the critical third element is actions, and this is the dynamism that makes the ontology truly special.

We're not just talking about semantics. We're also talking about kinetics: the kinds of actions that can modify the state of the ontology in a compliant manner.

So if I, for instance, search for that cancel customer order action that we saw before, we'll see it is in our library of actions.

If I open it up, you'll see all sorts of related information, like we have with other parts of the ontology: the API name for this cancel customer order action, the actual ID that we can use to interact with it programmatically, and the actual logic used to create this particular cancel customer order action. We see here what objects it pertains to, what properties it will be updating, what parameters it takes, and even what validation it uses.

This is all configured here through point and click, but it could be configured through a much more robust code-based interface as well, using the authoring tools we saw before in the more data ops part of the stack.

With this action defined, every time I want to cancel a customer order I can do so in a reliable, repeatable, auditable way. That means every application builder isn't implementing their own version of what it means to cancel a customer order. They simply use this action out of the library, and it knows how to modify the different objects and what precautions to take. If I'm a junior employee, it might be that when I click cancel customer order, it's aware of who I am and has to go get approval from my manager before it actually gets confirmed. And it might be that the last part of this is a webhook or some sort of write-back routine that then synchronizes this change back to an ERP or an MES system.

Again, all of that is a pretty complex flow for an action, a piece of logic that I don't want the application builder to have to think about. I just want them to be able to embed that piece of logic as an action and have Foundry, at a system level, deal with all the orchestration that happens thereafter.
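To make the flow above concrete (validation, approval for junior users, an audit trail, and a write-back step), here is a minimal sketch in plain Python. It is an assumption-laden illustration, not how Foundry implements actions: the `CancelCustomerOrder` class, the `junior` and `manager` roles, and the `outbox` list that stands in for a webhook to an ERP or MES system are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Order:
    order_id: str
    status: str = "open"


@dataclass
class User:
    name: str
    role: str  # "junior" or "manager" in this sketch


@dataclass
class CancelCustomerOrder:
    """One shared definition of what cancelling a customer order means."""
    audit_log: list = field(default_factory=list)
    pending: dict = field(default_factory=dict)
    outbox: list = field(default_factory=list)  # stand-in for a webhook call

    def submit(self, user, order, reason):
        # Validation runs the same way for every application using the action.
        if order.status != "open":
            raise ValueError(f"order {order.order_id} is not open")
        if not reason.strip():
            raise ValueError("a cancellation reason is required")
        # Junior users need a manager's approval before anything changes.
        if user.role == "junior":
            self.pending[order.order_id] = (user, order, reason)
            self._log(user, order, "approval_requested")
            return "pending_approval"
        return self._commit(user, order, reason)

    def approve(self, manager, order_id):
        if manager.role != "manager":
            raise PermissionError("only managers can approve")
        user, order, reason = self.pending.pop(order_id)
        self._log(manager, order, "approved")
        return self._commit(user, order, reason)

    def _commit(self, user, order, reason):
        order.status = "cancelled"
        self._log(user, order, "cancelled")
        # Write-back: queue the change for the external system of record.
        self.outbox.append(
            {"order_id": order.order_id, "status": "cancelled", "reason": reason}
        )
        return "cancelled"

    def _log(self, user, order, event):
        self.audit_log.append((user.name, order.order_id, event))


action = CancelCustomerOrder()
order = Order("SO-1001")
first = action.submit(User("sam", "junior"), order, "supplier shutdown")
second = action.approve(User("alex", "manager"), "SO-1001")
print(first, second, order.status)
```

An application builder would only call `submit`; the approval routing, logging, and write-back stay inside the action, which is the point the speaker makes about orchestration living below the application.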

We've alluded to this, but what it means for the application builder is that even though the application we've been looking at in this supply chain control tower example looks very custom, it's actually all built using a low-code application builder, in this case a no-code application builder. If I want to add something here at the top, for instance, it's very simple.

I point and click my way into adding a new widget in this particular view. I could add an object table, for instance, and it's super simple to start to wire different things into these views, because I'm not pulling data into a siloed database and trying to do things in a one-off manner. I'm connected directly to the ontology.

So I can easily open up a set of objects we've been working with before, like the plants, and start to pull them in and look at how they get bound to the widgets we've set up.

Here I'll type in our MSC plant, in this case, and it will pull in all of that data in a way that is aware of who I am and my security profile. I can add all the properties here, and just like that I've added a new widget.

In the same way that I can add in data, I can add in those actions as well. What this means is that, as an application developer, I no longer have to think about the full stack of considerations myself every time I want to deploy a new operational workflow application. I don't have to think about setting up a new hosting environment, the databases, the indexing, the compute, storage that's maybe non-database, how I am capturing feedback from users, how I am binding models to the different application states, or how I am capturing feedback from those models. All of that is taken care of below the level of the application.

It is taken care of by virtue of what we saw in the data integration, the setup of the ontology and the actions, the wiring in of the models that come together in these objectives, and the binding of the models to the ontology. That means you can build fully feature-rich compound applications in a couple of hours or a day or two, as opposed to spending months or quarters of effort.

So we think this is a really powerful part of the platform, and this is just one example of where the operational capability gets surfaced.

What we've been looking at is one example of an operational application builder, called Workshop. There are other application builders in Foundry as well. There's another one called Slate, which is a sibling application, and the key thing is that, of course, you can go and build your own custom applications as well.

You can do it in an object-oriented manner, or in a tabular or more classical manner. There's a full set of documentation, not just on interacting with existing applications but on building your own applications using the APIs in Foundry and every part of what we've talked about throughout this demo.

Awesome. Now I have one last thing to show you guys, and then we'll sign off for this particular demo.

That is how these things start to come together in a more strategic digital twin paradigm.

So what I want to open up is one final application in Foundry that we call Vertex. Vertex is kind of like the map application in Foundry, or sorry, the graph application in Foundry. You might think of it as showing all the relationships between all my different objects. If you were just to look at this when we were starting out, you might think it looks kind of nuts, there's so much going on here.

But now we know exactly what it is. It is just showing the kinds of connections I've built into my ontology, where every one of these object types has a lot of data being integrated into it, and we've also overlaid the different machine learning models and data science approaches.

In this case, this is an example that's loosely based, anonymized of course, on the work one of our medical supplier customers did during the early phases of COVID-19, when they needed to map out the full flow of how they were building and shipping ventilators, from their upstream suppliers to intermediate goods to their plants, all the way down to their distribution centers and their end customers themselves.

What this is showing is the bird's-eye view of the entire value chain, with the ability to scrub backwards and forwards in time and say: show me where alerts are firing, where inventory is getting to dangerous levels, where we're experiencing fundamental dislocations. I can always scrub through different points in time to look at where things were in the danger zone.

That's useful, of course, for getting visibility into the whole connected chain, which was spread across dozens of different data systems and models before. But the more important thing is that I now need to be able to update my strategy and actually operate in this very volatile environment, which it was last year. There were simultaneous supply and demand shocks, and every prebuilt model was thrown out the window. So they needed to be able to simulate different scenarios that were exigent day in and day out. What would happen if one of our suppliers went completely offline given the pandemic?

What would happen if one of our medical center customers had 10 times the demand they normally have for ventilators? How would we actually be able to meet these requirements, given how we are currently doing allocation and how we're currently doing different supply tactics? What we can open up here is a simulation tab to think about running those simulations, those scenarios, directly on top of sandboxed versions of your world, of the full ontology.

So we can simulate directly here, given that we have the data and the models connected, what would happen if that supplier went offline, for instance, and you can run these simulations as contained case studies. In this case I can make a very trivial change: let's just update the valve price per unit, dropping it to 400, and simulate across the entire chain what effect that would have on every dependent property we have modeled in the ontology.

That's a very trivial example. We can have this modified scenario state run continuously as new data comes in and continuously assess its impact. As you can imagine, it gets much more complicated and sophisticated when you're testing compound changes to parameters across not just one model but potentially many models. You can say: I'm not just taking one model, I'm going to chain different models together and run a compound simulation across the entire state, or the entire value chain.
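The scenario idea above can be pictured as copying the current state into a sandbox, overriding one parameter, and re-running a chain of models so that every dependent value is recomputed. The sketch below does exactly that with two toy models. All of it is assumed for illustration: the state keys, the baseline numbers, and the two model functions are invented, and only the 400 valve price comes from the talk.

```python
import copy

# Made-up baseline state; only the override value of 400 is from the demo.
baseline = {
    "valve_price_per_unit": 500.0,
    "valves_per_ventilator": 2,
    "other_cost_per_ventilator": 4000.0,
    "budget": 1_000_000.0,
}


def unit_cost_model(state):
    """First model in the chain: cost of one ventilator."""
    cost = (
        state["valve_price_per_unit"] * state["valves_per_ventilator"]
        + state["other_cost_per_ventilator"]
    )
    return {"ventilator_unit_cost": cost}


def output_model(state):
    """Second model: depends on the output of the first one."""
    return {"ventilators_buildable": int(state["budget"] // state["ventilator_unit_cost"])}


CHAIN = [unit_cost_model, output_model]


def simulate(base_state, overrides):
    """Run the model chain on a sandboxed copy; the base state is untouched."""
    state = copy.deepcopy(base_state)
    state.update(overrides)
    for model in CHAIN:
        state.update(model(state))
    return state


base = simulate(baseline, {})
scenario = simulate(baseline, {"valve_price_per_unit": 400.0})
print(base["ventilators_buildable"], scenario["ventilators_buildable"])
```

A compound scenario is the same call with several overrides at once, and re-running `simulate` whenever the baseline changes gives a rough analogue of a scenario that is continuously reassessed as new data arrives.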

This is where we really see the progression of customers working with Foundry: you are, of course, working with point applications or starter workflows to begin with, but really it's about the connected operations paradigm that unfolds by virtue of bringing together the ontology incrementally.

As an example in oil and gas, they were doing point optimizations and simulations for just one part of a connected value chain: the subsurface. They had one team working on the subsurface, integrating data, building models, and looking only at that element of what was going on.

But of course it's part of a network. The subsurface connects to the topside, which connects to the flow lines, which connect to the onshore. Each of those functions had its own siloed simulation, optimization, and data regimes. To really understand how to optimize production as a whole, they had to connect all of those parts of the organization into one ontology and connect all the different models into a single chained representation, and then they could actually run these full-scale continuous simulations and begin to <unk> optimize production. In that case, at full scale, they're starting to work towards really reconnecting their operations.

We see this again and again, whether it's industrials or finance or retail or health care, you name it. That is the north star for us: how are we helping our customers really re-engineer and reintegrate their operations?

I know we've covered a lot today, so I won't keep yammering on, but I want to recap a bit of what we've been through. We started at the top with how we think about data and ontology: data coming together from many different sources, structured, IoT, unstructured, geospatial, you name it, with over 200 out-of-the-box data connectors that we've built for Foundry, which are extensible so you can build your own.

And then being able to bring all of those different pieces of data together into a semantic representation that allows people to instantly start to look at, interact with, and wield that data.

The second piece is how we bring modeling, data science, and linear programming into that same frame, so we can overlay intelligence onto that data and build out not just a semantic representation but, as we said before, a kinetic representation: how do things actually flow or get dictated in the business process, based on what the models dictate?

You can build models within Foundry but also outside of Foundry, and then bind those models to the ontology. That lets you insert them into operations in a way that allows both a very robust and safe deployment of models and very rich feedback to model builders, who can know, because of the data lineage, the model lineage, and all the bookkeeping that Foundry does, exactly how users are using those models in different applications, and also in external applications that use the APIs.

And then finally, we talked about how this all comes together where, again, 70% to 80% of Foundry users live, which is operational applications. You can build these using low-code, no-code, or pro-code application builders, and the same capabilities can of course be surfaced directly via the APIs if you want to build external applications as well.

There is a lot more we can cover in follow-on sessions. We've only shown you a glimpse of how all these different tools come together in Foundry into a cohesive platform. We've only shown you a glimpse of all the different interoperability patterns that exist in Foundry.

We've only shown you a glimpse of how you can deploy workflows using a workflow catalog out of the box, getting started really quickly with an alerting workflow, an entity resolution workflow, a plant 360, or a resource allocation workflow, and all the prebuilt templates we have for getting started super quickly.

But we will save a lot of that stuff for another time. Thank you everybody for listening and I hope we can talk to you guys soon.

If you can't tell, I'm so, so excited to be able to share this with you. I just wanted to take care of a few things before we get into the roadmap.

The most important point is to make sure you have a play around with it yourself. All the links you need will be below: the mind map itself, that is, the roadmap itself, and a GitHub repo to go along with it, which has the full-version image of it.

The most up-to-date link will be the map itself.

There are also the slides in the GitHub repo, if you want to check them out.

There are also timestamps, so feel free to jump around. You don't have to watch the whole thing; just jump to the points that suit you best.

And if anything is missing or you have any questions, leave a comment below and I'll get back to you.

Nonetheless.

Enjoy.

Oh Rado Rado awry.

Welcome to the machine learning roadmap for 2020, a.k.a. a machine-learning-flavored, visual, interactive, living mind map slash compass. Well, that was a bit of a mouthful, but let's not spend any more time talking about it. Let's actually see this thing.

So if we come here.

This is interesting. We've got some colors, we've got some nice icons, and we've got a little white box with machine learning in it. What we're going to do in this video is explore the field of machine learning as much as we can in a relatively short-ish video. Whole textbooks have been written on machine learning, of course, but that's not what we're here for.

We're going to go through some of the main topics of machine learning, such as machine learning problems; the process, that is, the steps in a machine learning project; the resources, like how you might want to learn machine learning and the places you want to visit; and the tools you can use to get the job done. And after all, since machine learning is basically mathematics under the hood, we will see what some of the main topics are in terms of the kinds of math that run machine learning algorithms.

How are we going to do this? We're going to be playful, and that's what I want you to do with this resource. By the way, all the links I mention throughout this entire thing will be in the description below, so you'll be able to <unk> this as well and go through it.

We come to Whimsical. This is the tool we use, by the way. If we click this little button here, it's going to expand out, and look at that.

We're going to jump through some of these, maybe not all of them, because again I don't want this video getting too long. If you see a little purple button like this, you can see some more commentary, because rather than filling up this space I've just added some comments there. If you want to leave your own comment, you can add one here.

I've got mine here. You can probably place your own as well.

Now, let's get started. What I've got to go along with this is a little presentation. We're going to start with a question that's probably smart to answer when doing a machine learning roadmap video, and that is: what is machine learning? You might be asking that as a curious internet-dwelling user.

Maybe you just said that. You could google machine learning and get hundreds of different definitions, but for the sake of this video, to keep it nice and simple, we're going to treat machine learning as turning things, that is, data, into numbers and finding patterns in those numbers.

You might be wondering: how do you find those patterns in the numbers?

Well, the computer does this part. How? Math.

And again, we'll cover a little on this later. If you want another one-line definition of machine learning: machine learning is the field of study that gives computers the ability to learn without being explicitly programmed. That was by Arthur Samuel, I think over 50 years ago now.

That's a key point: without being explicitly programmed. So let's jump into traditional programming, which you might call Software 1.0, versus machine learning, Software 2.0. You might be wondering what the difference is between the two, between traditional programming and machine learning. Before we even get into the difference, I just want to let you know this:

Machine learning, to be put into practice, requires traditional programming to exist, whereas traditional programming does not require machine learning. So although machine learning is amazing, remember that you will need some traditional programming skills to be able to use it.

If we come back here, you might be wondering: where did Software 2.0 come from? That's when we can go to the roadmap. Again, if I mention anything, it's probably in the roadmap. So if I search Software 2.0, there we go.

This is what I want you to do as well once we've had a little bit of exploring. Again, we'll go through this in a minute; we're in the intro part just now. So, Software 2.0: click on the link.

And we go to a blog post here: "I sometimes see people refer to neural networks as just another tool in your machine learning toolbox."

"Neural networks are not just another classifier, they represent the beginning of a fundamental shift in how we write software. They are Software 2.0." I'll let you read that; I'm not going to read it all. But this is the way you can explore this roadmap: for every topic here, if it requires more information, I've put a link there, so be sure to check those out.

Let's go back to the presentation.

So let's have a visual demonstration of what traditional programming is compared with a machine learning algorithm.

In traditional programming, let's say you wanted to cook this favorite roast chicken dish. Look at that, that's delicious. I know we're talking about machine learning, but it's always a good time to talk about food.

With traditional programming you might start with a couple of inputs. Say you've got your box of vegetables and the raw chicken. You might program these steps: step one, cut the vegetables; step two, season the chicken; step three, get the oven preheated. To what temperature? Who knows, that's up to us. That might be a passed-down recipe from your Sicilian grandmother. Then cook the chicken for 30 minutes and add the vegetables. If you've done all this correctly, if you started with the right ingredients and with the <unk> rules that you've programmed yourself, you might end up with this beautiful roast chicken.

A machine learning algorithm, by contrast, usually starts with a set of inputs and a set of ideal outputs. In this case the ideal output is our Sicilian grandmother's roast chicken recipe.

And it might look at 100 or a thousand different examples of these inputs and outputs, and then it's going to figure out the instructions, that is, what the recipe is. So rather than us explicitly writing these instructions, this is a machine learning algorithm figuring out patterns in data. Actually, before it could even figure out these patterns, it would have to figure out some way to translate these inputs and outputs into numbers.

And how you do that is going to depend on what problem you're working on, but the overarching process remains the same: turn your inputs into numbers and then let a machine learning algorithm figure out the patterns in those numbers.
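To make the contrast concrete, here is a minimal sketch in Python. It is not from the video: the temperature-conversion task and the names hand_coded_rule and learn_rule are assumptions chosen only for illustration. The first function is a rule written by hand; the second works the rule out from example inputs and ideal outputs, using an ordinary least-squares line fit.

```python
# Traditional programming: we write the rule ourselves.
def hand_coded_rule(celsius):
    return celsius * 9 / 5 + 32


# Machine learning in miniature: we supply inputs and ideal outputs,
# and a least-squares line fit works out the rule from the numbers.
def learn_rule(inputs, outputs):
    n = len(inputs)
    mean_x = sum(inputs) / n
    mean_y = sum(outputs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(inputs, outputs))
    sxx = sum((x - mean_x) ** 2 for x in inputs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


inputs = [-10.0, 0.0, 15.0, 30.0, 100.0]        # example inputs
outputs = [hand_coded_rule(c) for c in inputs]  # ideal outputs
slope, intercept = learn_rule(inputs, outputs)
print(slope, intercept)  # close to 1.8 and 32, the rule we never told it
```

The learner never sees the formula, only the example pairs, which is the same shape as the recipe example: inputs plus ideal outputs in, instructions out.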

Now you might be wondering: okay, that sounds pretty cool, but why would we use machine learning?

Now this is a quote from another curious, and perhaps even more curious than before, internet dweller, which is also maybe yourself. If you're like me, I'm pretty curious about things.

And so a good reason would be: why not? But let's cross that out for a second. The better reason is: can you think of all the rules?

Now what I mean by that, if we think back to our roast chicken example, is: of course, for that simple problem you could (actually it might not be that simple, depending on how complex your Sicilian grandmother's roast chicken recipe is). But for a more complex problem, do you think you could write out, say, a thousand different rules? We'll have a good example of this coming up soon.

But if you looked at something and thought, "I want to teach this how to drive," do you think you could figure out all the rules by yourself?

Well, in my case, no, probably not. But if you had enough time, maybe you could.

So let's have a look.

What's the number one rule of machine learning?

If you can build a simple rule-based system that doesn't require machine learning, do that.

And again, maybe this rule-based system is not very simple. Maybe you've got a thousand different rules for how a car should approach a driving scenario. It backs out of your driveway; maybe that's one rule: back out of the driveway. Avoid other cars, drive down the hill: that's another rule. Turn right at the stop sign: that's another rule. Et cetera, et cetera. Don't hit people: that's really the number one rule.
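A rule-based system of this kind can be sketched in a few lines of Python. This is only an illustration: the function name next_action and the dictionary keys describing the scene are hypothetical, and a real driving system would need vastly more rules than this, which is exactly the point.

```python
def next_action(scene):
    # Rules are checked in priority order; the first match wins.
    # "Don't hit people" comes first because it is the number one rule.
    if scene.get("person_ahead"):
        return "stop"
    if scene.get("stop_sign"):
        return "stop, then turn right"
    if scene.get("car_ahead"):
        return "slow down"
    return "keep driving"


print(next_action({"person_ahead": True, "car_ahead": True}))
print(next_action({"stop_sign": True}))
print(next_action({}))
```

Every new situation means another hand-written branch, so the list of rules grows with every scenario you can think of, and misses every scenario you can't.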

And now this is a quote again, from a wise software engineer.

Actually, it's rule one of Google's machine learning handbook, which, if we come back to our roadmap,

zoom in.

These things are a little bit fun to play around with. If we go here, machine learning... actually, if we just search "Google machine".

Google Machine Learning Crash Course, yes, excellent. Actually,

there's something in here that's supposed to be...

Yeah.

Machine learning rules.

Machine Learning 101. I should've given that something better. Machine Learning 101, I'm going to put in brackets here

"machine learning rules", seeing as this is a search problem otherwise.

Machine Learning 101, actually 43 rules. So if we come here

and we click on the link.

Rules of Machine Learning. Now this is by Google, some big dogs in the machine learning field. If there was probably one company that's using machine learning everywhere, it's Google.

So if we come down here (for some reason my scroll doesn't want to scroll), we've got "Before Machine Learning", rule number one: don't be afraid to launch a product without machine learning. As we talked about before,

Software 1.0, hand-coding everything, can exist without machine learning, but machine learning can't exist without Software 1.0.

So, what is machine learning good for? And I just realized that the slide should probably say "What is machine learning good for?" rather than this incorrect grammar.

Anyway, it doesn't matter.

Number one: machine learning is good for problems with long lists of rules, just like we discussed before with the self-driving car. If you wanted to code up a self-driving car, you'd have to code in everything, such as: stop at stop signs, wait for pedestrians to cross the road, avoid hitting that beautiful little dog that's just run out chasing a ball.

When the traditional approach fails, machine learning may help.

Number two: continually changing environments. So again with the self-driving car (the reason why I use self-driving cars is that it's fairly easy to imagine these kinds of scenarios): if you're driving along in your local suburb, you might know how to drive around pretty well. So you might be able to code those rules for your local suburb pretty spot on.

But then if you ventured out, say, into another city, a continually changing environment, the rules that you made for the suburb where you live may not work very well. So in essence, machine learning is good for problems where it's required to adapt and learn about new scenarios.

And finally, discovering insights within large collections of data.

Now can you imagine trying to go through every transaction your, let's say, large company has ever had, by hand? Say you wanted to group together certain people: what are people purchasing during winter, what are people purchasing during summer? If you're Apple, what kind of people are purchasing iPhones? Now, I mean, that's really not a good example, because Apple is pretty big on privacy and they don't like collecting that sort of data, but maybe they will analyze it without having your specific user ID in there.

So again, if you tried to discover those insights within, let's say, an Excel spreadsheet with 10 million plus rows, and you're trying to go through all of your customers' purchases,

how long do you think that would take?

It would take a fairly long time. But these things (problems with long lists of rules, continually changing environments, discovering insights in large collections of data) are where machine learning really flourishes.

So let's see an example of this in practice. This is from Tesla's Autonomy Day video.

If we go here.

This is actually a really cool video; you can probably see why I was using the example of the self-driving car. It shows you how Tesla uses machine learning in production. So they are probably another one of the biggest companies that is actually using machine learning extensively in their products. So if you imagine we have a data source,

which is the cars, and they collect data from the environment. These cars have a series of, I think it's eight, cameras. They have radar. So all of these cars are collecting information from the environment. Now if you imagine there is a camera on the front here, and that camera is taking photos of what the car can see, essentially. Now the job of a machine learning algorithm is to turn these photos here into numbers and to find patterns in those numbers. Maybe it uses something like a neural network, which might look like this; it's a machine learning model drawn out.

It might go through each of these pixels, convert them into numbers, and go: "Oh, okay, I see this little number here; that resembles a car, that resembles something that I've seen before. These numbers here along the straight line, okay, that looks like a road line, and I've seen these before. And over here, that's a couple of headlights. So we want to avoid that; we want to not turn left into those headlights, because that's another car and that would be quite disastrous."
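Here is a small sketch of what "turning a photo into numbers" can mean. The 3-by-4 image and the function name to_model_input are made up for illustration; real camera frames have millions of pixels and usually three colour channels, and this is not a description of how Tesla does it.

```python
# A tiny 3x4 grayscale "photo": each value is a pixel brightness from 0 to 255.
# The bright column could be a road line seen by the front camera.
photo = [
    [0, 0, 255, 0],
    [0, 0, 255, 0],
    [0, 0, 255, 0],
]


def to_model_input(image):
    # Flatten the rows into one list and scale each pixel to the range 0..1.
    return [pixel / 255 for row in image for pixel in row]


numbers = to_model_input(photo)
print(len(numbers))   # 12 numbers, one per pixel
print(numbers[:4])    # [0.0, 0.0, 1.0, 0.0]
```

Once the photo is a list of numbers, "a road line" is just a recurring pattern in those numbers, which is what the model is asked to find.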

So.

If you were to try and code up all these rules yourself, that would take a very, very long time. So again, where does machine learning come into play? Problems with long lists of rules. So you might want to use computer vision to go through these and figure out where all the different patterns are.

Now what happens here? Okay, we might collect some data from the environment and find out that scenarios like this are handled pretty inaccurately. So we might do some testing, and then we might go a bit further here and find some more places where the car isn't doing very well. The machine learning algorithm doesn't quite know these scenarios, because this looks like it's in a tunnel of some sort. You might've trained your machine learning algorithm, or your hard set of rules, to work in your suburb, but as we said before,

if it works in your suburb, it won't necessarily work in a tunnel. So this is where machine learning comes into play again: continually changing environments, such as a road tunnel. I mean, if I was to code up a self-driving car, I don't drive through tunnels that often, so I'd probably forget to put in tunnels. And then when my self-driving car approaches a tunnel, because it's got a whole bunch of hand-coded rules for driving around my streets, it would be like,

"I'm not sure what to do with this."

So then you might grab these scenarios and label them. That means you've got someone looking at this, maybe a human annotator or maybe another machine learning model, looking at these scenarios it doesn't know very well and going: "Okay, I've figured out these scenarios. This is actually a tunnel. These are the tunnel walls." These are quite hard to see, but if you look closely there's a truck here, and there are lights at the top. Maybe a self-driving car gets confused by these lights at the top; they kind of look like road lanes. So we'd have to label it to tell our machine learning model. Remember, a machine learning model starts with inputs and outputs; in our case, the outputs are the labels.

In our chicken example, the inputs may be the ingredients for your grandmother's famous roast chicken dish. In our case, the ingredients might be the images collected from a self-driving car, and the outputs may be the actions that the self-driving car should take. Finally, once you've got those inputs and outputs together, you might train a machine learning model, which usually looks something like this,

and then deploy it back to the car. So eventually this becomes a loop.

And so you see, this is the final thing that machine learning models are really great at: discovering insights within large collections of data. So you can imagine if you're continually doing this loop, this little data engine here (we'll dive back into this in another section of this video),

it's going to start collecting lots and lots of data, lots and lots of information. And when I say data, that term is very broad. One kind of data could be photos and videos collected from the eight cameras on a Tesla car. Another one could be all of the transactions in your company's purchase history. Another one could be all of the text on Wikipedia.

Regardless of the data source, the principle remains: how do we turn this information into numbers, let the model find some patterns in it, and then design the software around that? This is usually Software 1.0: taking the outputs from the machine learning model and translating them into actions in a Tesla car.

If you want to see this more in depth, I encourage you to check out this Autonomy Day video; it's very, very good.

So.

What are we going to cover? Pretty broadly, that was a long-winded bit of an intro into machine learning in general. I mean, again, you can look machine learning up and find out your own stuff, but we've covered enough to understand what we're going to look through in this video.

So number one, we're going to look at machine learning problems. Okay, what does a machine learning problem look like?

How do you diagnose a machine learning problem?

Number two: machine learning process. So once you've found the problem, what steps might you take to solve it?

Number three: machine learning tools. What tools should you use to build your solution? This is growing quite rapidly, actually.

And now machine learning mathematics. As we said, the computer finds patterns in our data, and essentially what it's doing is a whole bunch of math. So what exactly is happening under the hood?

Now machine learning resources. Okay, now all of this is pretty cool; you might be saying, "How can I learn all of this?" Well, the good news is that a lot of it is available online and you can access it right now.

And how are we going to go through all this?

Well, there's a cook's style and a chemist's style. A chemist, if you imagine, is really, really specific, going through all of these things.

We're not going to go through it in chemist style; we're going to go through it in cook's style. So if you imagine what a chef does, what a cook does: a chef uses their tools, such as the controlled use of fire and a knife, and then goes through and works with them. So that's how we're going to do this: we're going to cover these topics broadly; we're not going to dive too deep into each of them. If I was to go through these and tell you exactly what each one of them is,

that would be doing you no justice. Instead, what I will continually emphasize (it's actually on the next slide) is how to approach this roadmap.

The number one thing to do is explore. Number two is to comment on this video or on the roadmap itself to give feedback on what's missing from it, because of course machine learning is a broad field, so chances are I've definitely missed something. Give advice if you have any. Share it if you want. Follow your curiosity, because there is a lot, so don't expect to get it all.

In fact, I actually don't get it all. I've put a lot of it here, and I have kind of put it here for myself as well, so I can come back and research and upgrade my own knowledge. And finally, we've got explore again.

It's there twice on purpose, because that's what I want you to do: explore. So in fact, you've probably realized that this is not a roadmap essentially telling you where to go; it's more a compass, giving you a little non-linear gentle push. That's a little pun there from neural networks as well.

Okay.

Are you ready?

Okay, let's go. Oh, do you like that sparkle effect?

I thought that was pretty cool.

Let's go; we'll go back.

Let's get our machine learning roadmap reset.

And we've got five major branches here that we're going to go through.

And we've got a little counterpart presentation.

Number one, machine learning problems. As you can see, I've kind of created a little subsection in the presentation for each of these little branches here.

So let's get into the first one, number one: machine learning problems. So if we go to this: what is wrong with this picture?

As you can see, this is actually a little bit of a famous metaphor here: putting the cart before the horse. So this is probably the number one skill I can give you out of this whole thing. When it comes to machine learning in general, and machine learning problems, it can be tempting to jump straight into machine learning and just go, "Boom, let's just put machine learning in this thing and make it right."

I can tell you out of my own experience working as a machine learning engineer, dealing with a lot of customers and clients that wanted to know about machine learning and see if it could be used: this was their problem. And actually it's my problem as well, because I'm a machine learning engineer and I look at everything with machine learning as my tool; I want to use it for everything. But

again, this is coming back to that Software 1.0 versus 2.0. Machine learning is amazing; we've seen some of the examples of where it can be used.

But it does require a horse.

So what I mean by that is, if you keep trying to apply machine learning to everything, it's kind of like putting the cart before the horse. So the idea of this section is to go from having the cart before the horse to having them in the correct order.

Now.

If you go to machine learning problems,

we've got categories of learning: supervised learning, unsupervised learning, transfer learning and reinforcement learning.

There are probably some others here, but these are the four main ones you're likely to encounter most often. You can break them into your own subsets. Let's go,

finally jumping to the roadmap.

I am so excited. You ready? Let's extend it here. We've got machine learning problems.

And what I've done is I've correlated, as I said before, the sections here to the sections here. So if we come in, this is how I want you to use this roadmap. We've got machine learning problems.

To break it out, we come up here: categories, types of learning. Supervised learning, what is that? You have data and you have labels. In our self-driving car example, our data might be images of tunnels, images of roads, images of people crossing the road, and the labels might be what those things actually are. So it's one thing to have just the image. It's another thing to go, "Okay, I know this is an image of a person crossing the road; I know this is an image of a stop sign." And so what the model does is it tries to learn the relationship between data and labels.

In our cooking example, it's trying to figure out the recipe. So this is some machine learning model finding patterns in numbers. For example, you have 10,000 photos of cats and dogs, 5,000 each.

In our case this would be a balanced class problem. Oftentimes in machine learning, actually, you'll have 9,990 photos of cats but only 10 photos of dogs. As you can imagine, a great example would be a self-driving car problem, where you've got a dog riding a bike. Maybe you've got a million photos of a person riding a bike but only one photo of a dog riding a bike. And who knows, maybe that's actually an alien. So that's what you've got to watch out for, but we'll cover that later. And the labels: which photo contains which animal. Photo one equals dog, photo two equals cat.

It works for numbers too. For example, if you have 10,000 houses and their selling prices, you use the information about the houses, such as number of bathrooms, number of garages and number of bedrooms, to try and predict the selling price of the house.
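The balanced and imbalanced photo counts mentioned above are easy to check before any training happens. This is a minimal sketch; the helper name class_balance is hypothetical, and the label lists simply reproduce the counts from the example.

```python
from collections import Counter


def class_balance(labels):
    # Return each label's share of the whole dataset.
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


balanced = ["cat"] * 5000 + ["dog"] * 5000
imbalanced = ["cat"] * 9990 + ["dog"] * 10

print(class_balance(balanced))    # half cats, half dogs
print(class_balance(imbalanced))  # dogs are one photo in a thousand
```

With the imbalanced set, a model that always answers "cat" would be right 99.9% of the time while never recognising a single dog, which is why the split is worth looking at first.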

Wonderful. I am not going to read out all of these; this is what I want it to be, so I'm kind of just bouncing back and forth. And truth be told, you could probably tell that I haven't scripted this; I'm kind of just going with the flow as well.

Unsupervised learning is data with no labels. So the model tries to find patterns in the data without something to reference.

For example, you might have 50,000 transactions, and 49,997 of them are similar but three of them are completely outlandish. This kind of problem is called anomaly detection.
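As a rough sketch of the 50,000-transaction example: the amounts below are invented, and the method (flagging values that sit far from the median, measured in median absolute deviations) is just one simple approach chosen for illustration, not something named in the video. The function name find_anomalies and the threshold of 10 are likewise assumptions.

```python
import statistics


def find_anomalies(amounts, threshold=10.0):
    # Measure distance from the median in units of the median absolute
    # deviation (MAD); a few extreme values barely move either statistic.
    median = statistics.median(amounts)
    mad = statistics.median(abs(a - median) for a in amounts)
    return [a for a in amounts if abs(a - median) > threshold * mad]


# 49,997 similar transactions (between 20 and 30) plus three outlandish ones.
transactions = [20 + (i % 11) for i in range(49997)] + [9500, 12000, 250000]

print(find_anomalies(transactions))  # [9500, 12000, 250000]
```

No labels were supplied anywhere: the three odd transactions stand out only because they don't look like the rest of the data.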

Other common problems include clustering and dimensionality reduction.

And you've got reinforcement learning.

Actually, I could put a link for all of these; maybe I should do that. Transfer learning is when you take the knowledge from one model and use it in your own. For example, take all of the text from Wikipedia, learn the relationships between words, and use these underlying relationships to help you build your insurance quote classifier.

Now, transfer learning is a very valuable skill.

So we've got transfer learning here because, oftentimes in practice, training a machine learning model can take a long time. Say you're building a self-driving car, for example: in that Tesla Autonomy Day video that we talked about, training took 70,000 hours on a GPU cluster, or 70,000 GPU hours. Now, you might not have access to that, but the beautiful thing about transfer learning is that

you can take what another machine learning model has learned, a.k.a. the patterns a machine learning model has learned on a particular dataset, adjust it to your own,

and then use it for your own problem.

So if we come back to the PowerPoint.

Now we've got machine learning problems: some problem domains.

Now, some of the main ones (we've kind of hinted at them just before) are classification, regression, clustering and dimensionality reduction. Again, you could divide these into more things. Actually, there could be sequence to sequence here, but I haven't covered that yet.

I would argue that sequence to sequence is actually a form of classification. We'll get into that in a second. We're kind of jumping ahead of ourselves here. Daniel, come on.

So classification would be like: does someone have heart disease or not, based on their medical records? Regression would be predicting a number, such as trying to predict the sale price of a house. If you were in the real estate business, you might be wondering, "What should I list this house's price as?" So you could build a machine learning model to go through all of the previous sales in your area and figure out what the most ideal price is for your certain house, based off the sale price of all of the other houses.

With clustering, you might want to find out what different groups there are. As we said before, say you had a whole series of transactions, or a whole series of people listening to your songs. Say, for example, you're Spotify and you have 1 million people listening to different songs: what kind of songs are they listening to? So maybe the people in this group are really into folk music, the people in this group are really into rap music, the people in this group are into pop music, and the people here are really into rock,

rock and roll.
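To give a feel for clustering, here is a tiny one-dimensional version of k-means, a common clustering algorithm. The video doesn't name a specific algorithm, so k-means is an assumption here, as are the function name k_means_1d and the made-up listening figures. Real listener data would have many numbers per person rather than one.

```python
def k_means_1d(values, k, steps=10):
    # Assumes k >= 2. Start with k centres spread evenly across the sorted values.
    ordered = sorted(values)
    centres = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    for _ in range(steps):
        # Assign every value to its nearest centre.
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            groups[nearest].append(v)
        # Move each centre to the average of its group (keep it if the group is empty).
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return centres, groups


# Hypothetical hours of folk music played per month by nine listeners.
listening_hours = [1, 2, 3, 10, 11, 12, 20, 21, 22]
centres, groups = k_means_1d(listening_hours, k=3)
print(centres)  # [2.0, 11.0, 21.0]
print(groups)   # [[1, 2, 3], [10, 11, 12], [20, 21, 22]]
```

Nobody told the algorithm what the groups were or what they mean; it only found that the numbers fall into three clumps, and naming those clumps is left to us.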

And then finally, there's another one called dimensionality reduction.

There's a little thing in machine learning called the curse of dimensionality, which is when your data has so many dimensions (columns) that our model can't really even find any patterns in it, because there's just too much of it. So what dimensionality reduction tries to do is reduce the number of dimensions you have, to only really pull out the most important things. So it might get rid of

the weight column. This is a table of numbers, by the way; remember, a machine learning algorithm finds patterns in numbers. So if we come back, we've got our classification problem. We might start out by doing some dimensionality reduction, because we're finding our machine learning model can't really figure out if a person has heart disease or not.

It might go through this, a simplified table of medical records, and go, "You know what, the only two things that we really need to know to tell if someone has heart disease or not" (again, this is just a made-up example) "are their heart rate and their age.

If their heart rate's too high, they've probably got heart disease, and so we actually don't need their weight."

Again, made-up example, just to simplify it and put it on one slide. And so,

we come back to our roadmap; we've got some example problems.

And before we even get to that, we've got classification. So let's try clicking on this link.

Machine Learning Mastery. This is a great blog; you should definitely check it out. That's kind of what I've done in this roadmap, actually: I've put all of my favorite resources for different things.

Really great resources for different topics. So you should definitely, definitely, definitely check out the links if you want more information. I've kind of given a little blurb about a lot of the things that deserve a little blurb, but if they need an expansion, this is where the links come in. So, classification.

We've got binary classification: is this email that I'm receiving in my email inbox spam or not? Again, imagine trying to code up all the rules for deciding whether an email is spam or not; you would probably use machine learning for that. Is this a photo of a cat or a dog? You could probably code in that a cat's face looks like a certain thing and a dog's face looks like a certain thing, but I'm too lazy to do that.

One of the reasons I love getting into machine learning is that the computer figures these things out for you. Multi-class classification:

that was binary.

It's one thing or not.

Multi-class is when you have multiple different options for what a specific thing is. In the case of traffic lights: is it green? That would be class one, or class zero if you started from zero.

Yellow, that would be class one. Or red. What breed of dog is in this photo? So

you might have 120 different dog breeds, and you want to try and build a machine learning model to classify what type of dog it is in a certain photo.

We actually did that in

the machine learning course I teach, by the way. So this is a little bit of a plug for that, but obviously you don't have to do it; I just want to show you an example of what a project in machine learning looks like.

We're coming in; I've loaded this up. So we used transfer learning. Remember what transfer learning is?

Taking the knowledge from one model and using it in your own. That's what we did for this project.

We turned dog photos into numbers (tensors). We picked a model from TensorFlow Hub (we'll see what that is in a moment, in another section of this video). We fit the model; fitting is basically the same as saying, "Hey model, find the relationship between these dog photos and the labels." We evaluated it, to see if it was good or not, and we improved it through experimentation,

and saved and reloaded the trained model.

So we came here. This is the kind of thing it looks like; all of this is machine learning code.

Now I'll come back.

There is also multi-label classification: what items does a photo contain? What topics is this YouTube video about? So if we were to build a multi-label classification machine learning model on this video that you are watching right now, what would you like the machine learning model to output?

Is it multi-label classification? Could you put multiple topics? Is this video about machine learning on its own, or is it about machine learning problems, or is it about machine learning resources? It could actually be all of them. So therefore, it's multi-label. Now let's come back to classification; remember, classification is a type of machine learning problem.
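The difference between binary, multi-class and multi-label classification shows up mostly in how a model's output scores are turned into an answer. The sketch below assumes a model has already produced scores between 0 and 1; the function names, the scores and the 0.5 threshold are all made up for illustration.

```python
def binary_prediction(probability, threshold=0.5):
    # One score, two outcomes: spam or not spam.
    return probability >= threshold


def multi_class_prediction(scores):
    # Several classes, exactly one answer: pick the highest score.
    return max(scores, key=scores.get)


def multi_label_prediction(scores, threshold=0.5):
    # Several labels, any number of answers: keep every label above the threshold.
    return sorted(label for label, score in scores.items() if score >= threshold)


print(binary_prediction(0.92))
print(multi_class_prediction({"green": 0.1, "yellow": 0.2, "red": 0.7}))
print(multi_label_prediction(
    {"machine learning": 0.9, "problems": 0.8, "cooking": 0.3}
))
```

A traffic light can only be one colour at a time, so the highest score wins; a video can be about several topics at once, so every topic that clears the threshold is kept.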

We've got example problems, and then we've got our evaluation metrics.

In other words, how do we know how well our machine learning model has learned different patterns for different problem sets? So then we can evaluate it with a confusion matrix. You might be wondering, what's a confusion matrix? I'm a little bit confused about a confusion matrix.

Well, if we have a look at this beautiful little guide from dataschool.io, another amazing resource: this is "Simple guide to confusion matrix terminology". Okay.

Well, if we read through this: okay, this is a confusion matrix here. So n is 165, the total number of things being classified, and then there's actual no, actual yes, and predicted. So a confusion matrix

compares the actual labels from our machine learning problem to the predicted ones, that is, what a machine learning model predicted. So ideally in a confusion matrix (I won't go through it too much here), on the diagonal you will have all the numbers

at their maximum, and on these sides here you'll have zeros. Here the model predicted

no when the actual was yes five times. So you would say it's confused on these two

sets of examples.

In other words, it's a comparison between the true positives, true negatives, false positives and false negatives.
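The four counts behind a yes/no confusion matrix can be tallied in a few lines. The eight labels below are invented for illustration (they are not the figures from the guide on screen), and the function name confusion_counts is hypothetical.

```python
def confusion_counts(actual, predicted):
    # Count the four outcomes for a yes/no problem (True means "yes").
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if a and p:
            counts["tp"] += 1   # said yes, was yes
        elif not a and not p:
            counts["tn"] += 1   # said no, was no
        elif p:
            counts["fp"] += 1   # said yes, was actually no
        else:
            counts["fn"] += 1   # said no, was actually yes
    return counts


actual    = [True, True, True, False, False, False, False, True]
predicted = [True, True, False, False, False, True, False, True]

counts = confusion_counts(actual, predicted)
accuracy = (counts["tp"] + counts["tn"]) / len(actual)
print(counts)    # {'tp': 3, 'tn': 3, 'fp': 1, 'fn': 1}
print(accuracy)  # 0.75
```

The true positives and true negatives are the diagonal of the matrix; the false positives and false negatives are the off-diagonal cells where the model is "confused".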

But I digress. As I said, we're just going to touch on these different things rather than me going through each of them.

It's much better for you to follow your own curiosity and figure out what you should do. So if you have a classification problem and you build a machine learning model, you should be looking at these evaluation metrics to tell how well your model is doing.

So if we come down to regression, another type of machine learning problem,

come back to our slide; remember, regression is if we're trying to predict, say, a number:

the ideal sale price for our house.

To come back:

Example problems: given the number of bedrooms, number of bathrooms and house location, predict the sale price of a given house. Or how much will Bitcoin be worth tomorrow? So if we look up the Bitcoin price.

Now you see what we've got here.

All of this is actually information. This is data that a machine learning model may be able to find patterns in.

And now I'll give you a little bit of a warning: prediction is actually very hard. There are actually a lot of companies that dedicate a lot of time and money to figuring out what the price of Bitcoin is and what the prices of stocks are going to be.

But this is just another example of where you can go, "Okay." You might look at this and go, "PayPal deal won't drive Bitcoin price higher." If you were to base your entire prediction

of Bitcoin's price tomorrow on this headline, what would you think? So, "PayPal deal won't drive Bitcoin price higher."

You'd probably want to read the article for more, but if I was to

base it just off this one heading of the article, I would say the Bitcoin price will either not change

or, in fact, maybe go down a little bit, based off this headline.

And now, again, I may be completely wrong, but this is the type of thing that a machine learning model could do. It could take in all of these headlines, it could even read the article, turn it into numbers of some sort, and then look at the history

of the Bitcoin price, something like this, and say, "Okay, well, on June 29th it was 13,348 Australian dollars, and these were the news headlines on that day."

And so if we go to one week: on June 25 it was $13,552, and the news headlines were saying that Bitcoin was going to go up, but it actually went down. So you might be able to figure out these patterns yourself, but again, chances are, could you code all the rules yourself? I know

I probably couldn't. So that's why you would probably want to use machine learning. We come back in. To evaluate how your machine learning model is going on a regression problem, you usually want to use R-squared, mean squared error or mean absolute error.

With mean absolute error, all errors are on the same scale. E.g., if you're trying to predict 100, predicting 99 is the same error as predicting 101. Whereas with squared error, the outliers stand out more. That means if being 10% off is more than twice as bad as being 5% off, you probably want to pay more attention to mean squared error. Again, we've got some links here.
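The three regression metrics just mentioned can be written out directly from their standard definitions. The function names below are my own, and the numbers reuse the "predicting 100" example.

```python
def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    # 1 means perfect predictions; 0 means no better than always guessing the mean.
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Guessing 99 or 101 for a true value of 100 costs the same under MAE.
print(mean_absolute_error([100], [99]), mean_absolute_error([100], [101]))  # 1.0 1.0

# Being 10 off costs twice as much as 5 off under MAE, but four times under MSE.
print(mean_absolute_error([100], [95]), mean_absolute_error([100], [90]))   # 5.0 10.0
print(mean_squared_error([100], [95]), mean_squared_error([100], [90]))     # 25.0 100.0

print(r_squared([1, 2, 3], [1, 2, 3]))  # 1.0
```

Squaring is what makes large misses stand out: doubling the size of an error quadruples its contribution to the mean squared error.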

Well.

We're blitzing through this, by the way. Now, another problem that we haven't mentioned here in the machine learning problems is sequence to sequence.

So if we go here: sequence to sequence is taking a sequence of something and turning it into a sequence of something else.

In other words, given a sequence of English text, translate it into French.

We go here: Google Translate.

This is a sequence to sequence problem.

Because when we go here:

"I love machine learning so much."

We want to translate it to, let's say, Spanish: <unk>.

Now, if you speak Spanish, you can probably correct my pronunciation or say whether this is actually a correct translation. But this is an example of a sequence to sequence problem using machine learning. Again, if you went through the entire English language, you could figure out the rules of what it translates to in Spanish.

Or you could try to, at least. But in my case, I haven't got time to go through all of the rules from English to Spanish, so I'm more than happy for a machine learning model to look at, let's say, all of English Wikipedia and all of Spanish Wikipedia and see where they line up,

and then figure out the rules and use the patterns in those numbers. So turn all the English into numbers, turn all the Spanish into numbers,

let a machine learning model figure out the patterns, and I'm more than happy to use those patterns.
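One simple way to "turn all the English into numbers" is to give every distinct word its own integer. This is only a sketch of the idea: the function names are hypothetical, and production translation systems use more sophisticated schemes (such as splitting words into smaller pieces), which are not shown here.

```python
def build_vocabulary(sentences):
    # Give every distinct word its own integer, in order of first appearance.
    vocabulary = {}
    for sentence in sentences:
        for word in sentence.lower().split():
            vocabulary.setdefault(word, len(vocabulary))
    return vocabulary


def encode(sentence, vocabulary):
    # Replace each word with its integer.
    return [vocabulary[word] for word in sentence.lower().split()]


english = ["I love machine learning so much"]
vocabulary = build_vocabulary(english)
print(vocabulary)
print(encode("I love machine learning", vocabulary))  # [0, 1, 2, 3]
```

Doing the same for the Spanish side gives two sequences of numbers per sentence pair, and the model's job is to learn how one sequence maps to the other.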

So let's come back.

We've got clustering; we've talked about that a little bit before. Then we've also got dimensionality reduction, and some common techniques for doing dimensionality reduction are PCA, and there's also representation learning or feature learning.

And again, I'll put some notes in here where I'm a little bit torn. So for this one: representation learning may be better off on its own branch.

Maybe better off on its own branch.

Wonderful okay.

So we've gone through a bunch of different machine learning problems.

Remember, we're cooks, not chemists, at the moment. So we've got, if we come back: supervised learning, unsupervised learning, reinforcement learning, transfer learning.

Classification, regression, sequence to sequence, clustering, dimensionality reduction: these are some of our main machine learning problems.

Now if we come back, we've done one branch. That's beautiful. Let's come back to our little keynote here. What are we up to next? Machine learning process.

Let's dive in.

Now this is probably the biggest part of the entire roadmap.

You can probably guess why: because the process of doing machine learning is relatively...

Let's just have a look. We go here: steps in a machine learning project.

Whoa, look at that.

Let's zoom right out.

Here we go.

We've broken it all down.

Now, I want to warn you again that we're going through this as cooks, not chemists. So this might not be exact. This is just what I've built out of my own experience, what I've found is most helpful.

It breaks down into a series of subtopics. We've got data collection: if you want to find patterns in data, how do you collect that data in the first place?

And you've got data preparation too. Remember, as we said, a machine learning model works by finding patterns in numbers. So how do you turn the data that you collect into numbers?

Then we've got, right down here, choosing an algorithm.

Or actually, before that, we've got train model on data, which can be broken down into three steps, if you think about this.

Choose an algorithm, once you've had a little bit of experience. If you're beginning, you've probably never heard of these steps, but usually it goes: choose a certain type of machine learning algorithm. Usually the algorithm is based on what machine learning problem you're facing, so there are certain algorithms for certain problems.

Overfit the model.

Overfitting means the model has learned the patterns so well that it's doing too well.

And then reduce overfitting with regularization, which is a technique to hold back our model's potential for learning patterns in data, so that it can adjust itself to new data.

And then, if we come down here: analysis/evaluation.

Serve model. So once you've trained a machine learning model, it's one thing to try it on your local computer. It's another thing to have that deployed in production to, say, a million self-driving cars. And then finally, retrain the model when necessary. So that's what we're going to go through.

And again, we're not going to cover all of it, because that'll take way too long, but we're just going to bounce back and forth.

Going back to the presentation.

If we have a look at this, I've kind of made it into a little colorful flow chart. What we want to do is like a multi-angle attack.

Remember how I said I break this down into, like, six different subtopics? Well, that's all in one beautiful little colorful chart here. We've got data collection, where we'll ask questions like: what data exists? Where can you get it? Is the data public? Are there privacy concerns? Is it structured or unstructured? And what I mean by structured and unstructured is this. Imagine structured data

is like an Excel spreadsheet: you've got rows and columns. Unstructured could be like a photo or video or natural language text. It has some kind of structure; I mean, if it didn't have any structure, a machine learning model wouldn't be able to find patterns in it. But it's not rows-and-columns type structured data.

Data preparation: here there might be steps like exploratory data analysis, or data preprocessing (remember, turning data into numbers), and data splitting: turning your data into a training, validation and test set. We'll get into that in a moment. Train a model: choose an algorithm, overfit, then regularize.

Tune the hyperparameters.

Once you've trained a model, analyze the model. Once you've analyzed the model and go, "Yes, my model is ready," you can serve it:

put it into production so that people can actually use it. And then finally, retrain the model if your data is continually changing, as you can imagine in a self-driving car problem,

where a road has been updated due to roadworks: are your predictions still valid on that new road? And again, it's a continuous process. That's why we've got a big loop here that goes back and forth. In traditional programming, you might just have your code, and you have to make sure that your code is up to date, whereas in machine learning you've got

code, you've got data, and you have to make sure that they're both up to date; otherwise you'll be predicting the wrong thing.

So if we come back.

Let's go through our first step here.

Data collection. So you've got some questions to ask here.

What data sources exist already? Now, we're going to have an example of what data sources there are

when we come down to machine learning resources, but we're not up to that yet.

If you want to skip ahead, you can definitely skip ahead. What privacy concerns are there? So if you are dealing with health data or medical data, it's probably a wise idea to make sure that no personal information can be leaked. Is the data public? So in the case of Tesla, we can't necessarily access

all of their machine learning data. Once you've formulated your machine learning problem, as in, thought about whether it's a classification task or a regression task, one of your first steps is to start

collecting data on that problem, to figure out if a machine learning model can find patterns in it.

What kind of problem are we trying to solve? Well,

actually, this should probably be number one. But again, these can be in some relative order. Where should you store the data? Do you store it on your laptop, or is it stored in the cloud?

And then you've got different types of data. As we said before, structured data: data which appears in a tabulated format, rows and columns style, like what you'd find in an Excel spreadsheet.

Now, even within structured data, you have different types of data. Nominal/categorical data: is something one thing or another? Are they mutually exclusive from each other? For example, in car sales, color is a category.

A car may be blue but not white, and the order does not matter. Then you've got numerical: any continuous value where the difference between them matters. For example, when selling houses, $107,950 (that's actually probably a relatively cheap house, depending on where you are living in the world; I mean, if

that was in Brisbane, I'd probably snap that house up in a heartbeat, unless it's a bit of a dump)

is more than $56,400. Then you've got ordinal data: data with order. Then you've got time series data, just like that Bitcoin example.

This is time series data because it goes

over a time period: one week, one month. There we go, so that's over a time period.

We'll come back, and we've got another type of data, which we touched on before: unstructured data. So data with no rigid structure: images, video, natural language text, speech, what I'm saying to you right now. So if I wanted to build a speech-to-text system, how could it look at the sound waves that are coming out of my mouth and figure out

what that sound wave translates to in text? Now, I want to give you a little quiz as we're watching this. If I wanted to do that problem, turn my voice, my sequence of words (there's a little hint for ya: sequence of words), into text, what kind of problem would that be?

Well, if we come back

all the way back down to our machine learning problems (this is why exploring is so valuable), that would be a sequence to sequence problem.

Build a device which responds to speech commands.

I'm not going to say that, because my smart speaker is going to go off. Speech goes in as a sequence of vibrations in the air, what I'm making now with my voice, and it gets converted to text and interpreted as a command. That's a sequence to sequence problem. Same thing with playing the voice back: it takes a sequence and then plays the voice back as a sequence of sound waves as well.

Just really confused myself there

with my words, I said "sequence" several times, but you get what I mean: you take one sequence of sound waves and turn that into text, and then for the speaker to play something back, it will take the text,

do some sort of command, and then reply back with the voice.

So if we go here, close that one off, and come back.

Now we've looked at data collection, let's have a concrete example, because I don't always just want to be jumping through all these little tentacles here. We come back to our Tesla data collection and model training, the little data engine here, and how might this fit in?

So this machine learning process: we've got data collection, data preparation, train a model, analysis/evaluation, serve model, retrain model.

How does this fit in?

So if we put that back on the screen.

So let's have a look at data collection.

We have a data source.

And this is jumping ahead, but that is okay. We have a data source. So the cars, as I said, these all have cameras on them. So they'll be collecting data from the environment, taking photos, et cetera, et cetera. That's going to be stored on

a Tesla server somewhere. So this is data collection. They have a whole fleet of cars. Actually, probably one of the best examples that you'll find in the real world is Tesla's data engine here. So let's say: data collection.

Now data preparation, we haven't touched on that yet so let's do that.

Whoa, we come here. We might close this one off, since we've covered that. So this is data preparation, and one of the first steps you might do in data preparation is a little thing called exploratory data analysis: learning about the data you are working with. And if we click this little link here,

what do we come to?

A Gentle Introduction to Exploratory Data Analysis, by yours truly.

So this is an 18-minute article on it, if you want to find out a little bit more. It goes through an example problem.

But let's just talk about what we've got here on the roadmap.

So, questions you might ask in an exploratory data analysis. Basically, what you are trying to do here is get to know your data

before your machine learning algorithm goes through it.

In the case of Tesla, before their machine learning algorithm even looks at this. Now again, if you are building a self-driving car, you might have some experience driving cars, so you kind of know what data it's collecting, but you don't know how it's going to look when it's being collected from a camera here, from a camera here, from a camera here. And again, I'm making this up; maybe they have them

in different positions.

The things you'll be asking: what you're trying to do is just look at the information that you have and kind of build your own machine learning model in your head before you even build one that uses math.

What are the feature variables (the inputs), and what are the target variables (the outputs)? In our case of cooking our favorite Sicilian grandmother's roast chicken recipe, the feature variables, the inputs, might be the organic chicken from the farm up the road and the vegetables sourced from Blue Mountain

and all that sort of stuff, or really you can just go buy it from the supermarket. The target variable, so the output, might be that beautiful delicious dish, or in the Tesla car example, it might be not to drive head-on into the wall.

For example, if you were predicting heart disease, the feature variables may be a person's age, weight, average heart rate and their level of physical activity, and the target variable will be whether or not they have heart disease.
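As a rough sketch of that idea, here is how feature variables and a target variable might be separated in Python. The patient records, the column names and the 0/1 coding of the target are all made up for illustration; they are not from any real dataset.

```python
# Hypothetical patient records, invented purely for illustration.
patients = [
    {"age": 63, "weight_kg": 82, "avg_heart_rate": 78, "activity_level": 1, "heart_disease": 1},
    {"age": 41, "weight_kg": 70, "avg_heart_rate": 64, "activity_level": 3, "heart_disease": 0},
    {"age": 57, "weight_kg": 95, "avg_heart_rate": 85, "activity_level": 0, "heart_disease": 1},
]

# Feature variables are the inputs; the target variable is the output we want predicted.
feature_names = ["age", "weight_kg", "avg_heart_rate", "activity_level"]
target_name = "heart_disease"

# X holds one row of inputs per patient, y holds the matching labels.
X = [[p[name] for name in feature_names] for p in patients]
y = [p[target_name] for p in patients]

print(X[0], "->", y[0])
```

Keeping the inputs and the label in separate variables like this is only a convention, but it is the shape most machine learning libraries expect.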

Again, this has been very simplified, but what kind of problem would that be, if we were to take someone's health information

and try to figure out whether they had heart disease or not?

Is it classification?

Or is it regression, trying to predict a number?

If you guessed classification, 10 points, because that's one thing or another.

You might also ask: what kind of data do you have? Structured or unstructured? Categorical or numerical? You might create a data dictionary for what each feature is. For example, if you have a column of numbers called HR, now, is that hours?

How would someone else know that it actually means heart rate? So these are the things you might line up. Let's go to Kaggle.

Kaggle is another beautiful website for data sources, and actually you can learn a lot from Kaggle. You can look at problems. This is Titanic data.

This is what a data dictionary might look like. Here are the variables. So we've got these inputs here: survival, pclass, someone's sex, their age. Now, what is sibsp?

I don't know what sibsp is, but if we come to look, the definition is the number of siblings/spouses aboard the Titanic.

Now I'm starting to understand what kind of data I have.
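A data dictionary can be as simple as a mapping from column name to a plain-English description. The sketch below uses the column names mentioned above; only the `sibsp` definition is taken from the walkthrough, and the other descriptions and the `describe` helper are assumptions added for illustration.

```python
# A minimal data dictionary: column name -> human-readable meaning.
# Only the "sibsp" wording comes from the walkthrough; the rest is illustrative.
data_dictionary = {
    "survival": "Whether the passenger survived",
    "pclass": "Ticket class",
    "sex": "Sex of the passenger",
    "age": "Age of the passenger",
    "sibsp": "Number of siblings/spouses aboard the Titanic",
}


def describe(column):
    """Look up a column, flagging anything that has not been documented yet."""
    return data_dictionary.get(column, "unknown column: ask whoever collected the data")


print(describe("sibsp"))
print(describe("HR"))  # an undocumented column such as HR is exactly the ambiguity to avoid
```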

So if we come back: are there missing values? Should you remove them, or fill them with feature imputation (see below)? We'll have a look at that in a second. Feature imputation is basically just a way of filling missing values.

What are the outliers? How many of them are there?

Are they out by much (three-plus standard deviations)? Now, this is just a little heuristic; you don't have to use that. An outlier will really depend on the data that you're working with. One way to find outliers is to plot a histogram, a plot of the distribution of your data. If you're not sure what that is, we'll touch on that in a minute.
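The three-standard-deviation heuristic can be sketched in a few lines. The heart rate values below are invented, and the function name `find_outliers` is just a label for this example; the threshold of 3 is the heuristic from the walkthrough, not a rule.

```python
from statistics import mean, pstdev


def find_outliers(values, threshold=3.0):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    return [v for v in values if abs(v - mu) > threshold * sigma]


# Made-up resting heart rates with one suspicious reading at the end.
heart_rates = [72, 75, 71, 69, 74, 73, 70, 76, 72, 71,
               74, 73, 70, 75, 72, 71, 73, 74, 72, 190]

print(find_outliers(heart_rates))
```

Note that on very small samples a single extreme value inflates the standard deviation so much that nothing crosses the threshold, which is one reason the heuristic should be checked against a plot of the data.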

Why are the outliers there? There are other questions you could ask a domain expert about the data. For example, would a heart disease physician (remember our problem, trying to predict heart disease) be able to shed some light on your heart disease dataset? More than likely, yes.

So.

When we come back, data preparation, we're still going. That was only one step, exploratory data analysis. The main idea of that is to become one with your data, to build your own machine learning model in your head. Then data preprocessing: preparing your data to be modeled.

Well.

Now we've got a few steps here.

It's probably wise, because we did cover EDA, to just jump to the major topics. Feature imputation: filling missing values.

What's important to remember about a machine learning model is that, for the time being, if you're working with structured data, pretend that a machine learning model can't learn on data that isn't there. There are subtopics about machine learning models adjusting to different data sources and that sort of stuff; we're not covering that, we're covering the stuff that you're going to come across most often.

A machine learning model can't learn on data that isn't there. So what you would want to do is impute that data, so fill it in. If you want to have a look at this article:

Great article here: six different ways to compensate for missing values in a dataset.
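One of the simplest ways to impute is to fill each gap with the mean of the values that are present. The sketch below assumes missing entries are represented as `None` and uses made-up heart rates; mean imputation is just one option among several, not necessarily the one the linked article recommends.

```python
from statistics import mean


def impute_with_mean(values):
    """Replace missing entries (None) with the mean of the known entries."""
    known = [v for v in values if v is not None]
    fill = mean(known)
    return [fill if v is None else v for v in values]


# Made-up column with two missing readings.
heart_rates = [72, None, 80, 76, None]
imputed = impute_with_mean(heart_rates)
print(imputed)
```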

And then you might want to do feature encoding. So once you've filled your missing values in a dataset...

You typically do this with structured data, by the way. Feature encoding is turning your dataset into numbers. Now, a machine learning model requires all values to be numerical. No matter what you're trying to model, whether it's text, whether it's sound waves, whether it's images, it has to be in some kind of numerical form. Then you've got one-hot

encoding, and you've got label encoding and embedding encoding.

Now, these are just different types of ways of turning your data into numbers.
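To make two of those concrete, here is a small hand-rolled sketch of label encoding and one-hot encoding for a categorical column. The car colors are invented, and real projects would usually reach for a library encoder rather than writing this by hand.

```python
# Made-up categorical column: car colors.
colors = ["blue", "white", "red", "blue"]

# Fix an order for the categories so the encoding is reproducible.
categories = sorted(set(colors))

# Label encoding: each category becomes one integer.
label_of = {category: index for index, category in enumerate(categories)}
label_encoded = [label_of[c] for c in colors]

# One-hot encoding: each category becomes its own 0/1 column.
one_hot = [[1 if c == category else 0 for category in categories] for c in colors]

print(categories)
print(label_encoded)
print(one_hot)
```

Label encoding implies an order between categories that may not exist, which is why one-hot encoding is often preferred for nominal data such as color.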

Then you might have feature normalization (scaling) or standardization. So, when your numerical variables are on different scales. For example, the number of bathrooms is between

one and five, and the size of land...

Maybe I'll format this so people know that it is a variable: size of land.

Again, this is not scripted, this is just us exploring. This is between 500 and 20,000 square feet. Some machine learning algorithms don't perform very well when, see how this is between one and five, and the other variable, size of land, is between 500 and 20,000? So the scaling is all off. What normalization

will do is put your variables between zero and one.

Scaling and standardization help to fix this, and there's a little bit more depth there on what each of those are.
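The two fixes can be sketched as follows: min-max normalization squeezes a column into the range zero to one, and standardization rescales it to zero mean and unit standard deviation. The bathroom counts and land sizes are invented values chosen to match the ranges discussed above.

```python
from statistics import mean, pstdev


def normalize(values):
    """Min-max scaling: map the smallest value to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Standardization: subtract the mean, divide by the standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


# Made-up columns on very different scales.
bathrooms = [1, 2, 3, 5]
land_size_sqft = [500, 2000, 8000, 20000]

print(normalize(bathrooms))
print(normalize(land_size_sqft))
print(standardize(land_size_sqft))
```

After either transformation the two columns are on comparable scales, so an algorithm that is sensitive to magnitude no longer treats land size as thousands of times more important than bathrooms.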

Feature engineering. So this is where you input your own knowledge into your data. So you might have a collection of data which you've just collected in raw form from the environment, and feature engineering is where you take your domain knowledge and encode that into the data.

So: transform the data into a potentially more meaningful representation by adding domain knowledge. If we click this little article here:

Boom. Discover feature engineering, how to engineer features and how to get good at it. So I'll let you read through that one. Oh, there's a good one there: feature engineering is an art. So a lot of these topics, and this is what's quite hard about machine learning actually, a lot of these things that we're covering

aren't necessarily exact sciences.

Again, it's why I put it in the slides twice before: the biggest thing that I can impart on you is a willingness to explore, to try and figure out, to try different experiments and see what works and what doesn't. In feature engineering, you've got different things like decomposing: turning a date, such as, imagine you've got this date here.

It looks like that: 2020, <unk> hundred <unk> at this time. You could decompose that into the hour of the day, the day of the week (what day of the week was that?), what day of the month was that, is it a holiday? So if you're looking at the Bitcoin price

and you're wondering why it is so low there.

Okay, that's $12,963 on June 15th. Maybe June 15th was some special holiday, or maybe something happened on that day that you could encode into your dataset based off this timestamp here. Now, of course, I'm making this up; June 15th may have no significance whatsoever, but

these are the types of things where you can use your knowledge to enrich your dataset.
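Decomposing a timestamp might look something like the sketch below. The exact timestamp is hypothetical (the walkthrough only mentions June 15th), and the `decompose` helper and its `holidays` parameter are names invented for this example.

```python
from datetime import datetime


def decompose(timestamp, holidays=()):
    """Split one timestamp into several simpler features."""
    return {
        "hour": timestamp.hour,
        "day_of_week": timestamp.weekday(),  # Monday is 0, Sunday is 6
        "day_of_month": timestamp.day,
        "month": timestamp.month,
        "is_holiday": timestamp.date() in holidays,
    }


# Hypothetical timestamp; the time of day is made up.
ts = datetime(2020, 6, 15, 14, 30)
features = decompose(ts)
print(features)

# Domain knowledge enters through the holiday list you choose to supply.
print(decompose(ts, holidays={ts.date()})["is_holiday"])
```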

Then you've got discretization.

Discretization, I have to really think about how to pronounce that: turning larger groups into smaller groups. There are some examples there.

Crossing and interaction features.

So, combining two or more features. Indicator features: using other parts of the data to indicate something potentially significant.

Now, by the way, I've kind of put these in the relative order that you'd take in a project, but again, because it's so experimental, machine learning is a lot of bouncing around, just jumping between different things.

So have a read through these things and just see if any of them relate to what you're working on, or if you just want to learn more in general, hit the link.

Feature selection.

So, selecting the most valuable features of your dataset to model. Remember before, when we removed those two features from trying to predict heart disease? The curse of dimensionality. Let's have a look at that, actually: curse of dimensionality.

Let's zoom in. The curse of dimensionality refers to the various phenomena that arise when analyzing and organizing data in high-dimensional spaces

that do not occur (so basically, high-dimensional spaces just means lots of different data, like lots of different features) in low-dimensional settings, such as the three-dimensional physical space of everyday experience. And that's another thing that you'll see

with machine learning, because there's a lot of math in there, which is...

Don't get me wrong, I'm not taking away the importance; machine learning doesn't exist without that math, put it that way. But you'll see a lot of things that can be quite daunting if you haven't covered math in a while. So you'll see Greek symbols, you'll see terms like high-dimensional space, but the real best way to go about these things is just through practice. Don't worry

if you don't get it the first go. Work through problems, work through projects, and read different things from different people, different sources.

In time, you will start to understand it more.

So with feature selection, you could do dimensionality reduction; feature importance, which you typically do after modeling; and wrapper methods such as genetic algorithms. Whoa, that's wordy.

TPOT does this.

I love TPOT.

There we go.

Some cool code running there. I'll let you check that out.

Dealing with imbalances. Remember, we talked about this before, we briefly touched on it: does your data have 10,000 examples of one class, but only 100 examples of another? Now, what's a good example for this? Well, here are some ways that you can deal with that, but a good example of this might be in the case of Twitter. If you have tweets,

you might have 10,000 of them that are good, people being nice, because generally people are pretty nice, but you might have 100 examples of people just being toxic idiots. And you're trying to build a Twitter classifier to go, well, this tweet is nice, or these 100 examples are actually pretty harmful, like these are people being really

mean to someone else, and that's potentially harmful.

You might not have many examples of these because, hopefully, people are

pretty good.
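One common way to handle this kind of imbalance, offered here as an assumption rather than as the method on the roadmap, is random oversampling: repeat examples of the rare class until the classes are the same size. The tweets below are placeholder strings, and `oversample` is a name made up for this sketch.

```python
import random
from collections import Counter


def oversample(examples, labels, seed=0):
    """Randomly repeat minority-class examples until every class matches the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_examples, out_labels = list(examples), list(labels)
    for label, count in counts.items():
        pool = [x for x, y in zip(examples, labels) if y == label]
        for _ in range(target - count):
            out_examples.append(rng.choice(pool))
            out_labels.append(label)
    return out_examples, out_labels


# Placeholder data mirroring the 10,000 versus 100 split described above.
tweets = [f"nice tweet {i}" for i in range(10000)] + [f"toxic tweet {i}" for i in range(100)]
labels = ["nice"] * 10000 + ["toxic"] * 100

balanced_tweets, balanced_labels = oversample(tweets, labels)
print(Counter(labels), "->", Counter(balanced_labels))
```

Oversampling only repeats what is already there, so collecting more genuine examples of the rare class, where possible, is usually the better fix.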

Now that's data pre processing in a nutshell.

So if we come back, where are we in the presentation? Data preparation. Now, in the case of Tesla, a preparation step might be trying to source kinds of data that you don't have very much of. So let's say you've got a stop sign (I'm taking this from the Tesla Autonomy Day video, by the way; there was an example in there).

They've got some stop signs there,

occluded by trees.

So imagine this is a stop sign.

If you're a Tesla car and you see that, what should you do?

So one of Tesla's data preprocessing steps would be to collect more images. Now, this comes through analysis and evaluation; remember, this is a continuous process. Maybe after training a model they found that its performance on stop signs was not very good in a certain number of scenarios. So they might want to collect more data for that type of scenario

and then train a model to be better on that scenario. Remember, this is a continuous process. So let's go back.

Finally, in data preparation, we've got data splitting.

Now, this involves splitting your data into three sets. The training set, which is usually 70% to 80% of the data: the model learns on this. The validation set, typically 10% to 15% of the data: the model's hyperparameters are tuned on this. Now, if we imagine back to our cooking example at the start, hyperparameters are

the settings on your model that you can use to improve it, such as the settings on your oven when you're cooking your grandma's Sicilian roast chicken dish. If you cook it on 200 degrees...

But if you cook it on 190,

you'll get that perfect dish.

And then finally, the test set, which is usually 10% to 15% of the data. You want to keep this separate. So this is like if you are studying for an exam: the training set is your course materials, the validation set is your practice exam that your professor sets you, and then the test set is the final exam. So your model's

final performance is evaluated on the test set, and if you've done it right, hopefully the results on the test set give a good indication of how the model should perform in the real world.

Do not use this dataset to train the model. I can't stress that enough. Keep them separate from each other.
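A three-way split along those lines can be sketched with nothing but the standard library. The 70/15/15 proportions fall inside the ranges quoted above; the function name `split_data`, the seed and the stand-in data are assumptions for illustration.

```python
import random


def split_data(rows, train=0.7, valid=0.15, seed=42):
    """Shuffle once, then cut into training, validation and test sets with no overlap."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_train = round(len(rows) * train)
    n_valid = round(len(rows) * valid)
    train_set = rows[:n_train]
    valid_set = rows[n_train:n_train + n_valid]
    test_set = rows[n_train + n_valid:]  # whatever is left, never used for training
    return train_set, valid_set, test_set


# Stand-in dataset: 1,000 example ids.
train_set, valid_set, test_set = split_data(range(1000))
print(len(train_set), len(valid_set), len(test_set))
```

Because the three slices come from one shuffled list, no example can land in more than one set, which is the property that keeps the test set an honest final exam.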

Well.

So that's data preparation.

Let's go here now.

We've got, as I said already: how do we train the model? Remember the three steps: choose an algorithm, overfit the model,

meaning that it's actually learned too well, it's learned the patterns too well, then reduce the overfitting with regularization. Again, beginning in machine learning, you'll look at something like regularization and you'll be like, "I've never heard of that word in my life." Well, that's okay. We're going to go through what it is.

Or another great trick is you just go,

how to use it,

this is a search bar,

and: what is regularization?

You'd be amazed at how far you'll get by just reading the top three links. And then just read them. Don't even bother, like, if you don't understand some terms, just read them.

And then it'll sink in to your brain over the space of many months, and then you can come back and read it again. The second time you read it, it actually makes a little bit more sense. Anyway.

Step one: choosing an algorithm.

How do we do this? Well, I'll give you one great example.

Choosing the right estimator. You might look at this and go, whoa,

that chart has a lot of stuff on it, but

this is probably something you might get a little bit familiar with when you're starting to learn machine learning. This is scikit-learn.

Oh, let's take a little step back here. At the moment, you might be asking, okay,

there's a whole bunch of stuff about machine learning algorithms; do I have to know how to <unk>? Well, let me tell you how machine learning engineers like myself, or like many others, work in practice, even in the Tesla scenario.

This model here.

Chances are, in Tesla's case, they're probably coding it from scratch, but if you watch this video, they're actually using some form of transfer learning. And it's not necessarily taking a model from another domain, because their model is actually very specific to their cars and...

Someone has created a style of model; in fact, it's called a ResNet model. So if we go here:

ResNet.

Typical naming.

Residual neural network. So Tesla's models actually use ResNet under the hood, but then they just apply...

So, say for example, you might have this part; if someone's already built that, you can access that part right now. But for Tesla's specific problem, they may build these parts with code.

So if you're starting out on your own problems, chances are the majority of this has been built for you. You just have to learn how to diagnose a certain problem,

get your data ready to fit through this model, and get the outputs lined up. That's a big part of machine learning right there.

So we'll come back to this "choosing the right estimator" in scikit-learn, which is a machine learning Python library. A machine learning model is often referred to as an estimator. So if we go through this, let's say for our heart disease classification problem. We have: start. We have, let's say, 300 patients. So we have, yes, above 50 samples. Predicting a category? So we are doing

classification, so that's heart disease or not. Do you have labeled data? Yes. We have under 100K samples. So we might go to a linear SVC. Or you can actually try each of these. The green boxes are machine learning algorithms. As we go here,

ensemble classifiers. My favorite is the random forest: forests of randomized trees. There we go.

Now, if you've never coded before, you might look at this and go, whoa, what's going on here? But let me just talk you through this.

You've got from sklearn.ensemble, so scikit-learn's ensemble models, import RandomForestClassifier. So this is a machine learning model here, in code, already written for us.

Then we're going to define our inputs, which are usually defined as X, which is this series of numbers here.

And we're going to define our labels, which are usually defined as Y. So these are inputs and the ideal outputs that we want.

Then we're going to instantiate our machine learning model, which is clf equals RandomForestClassifier. What we want to do here is get our random forest classifier to work out the patterns between zero, zero (this is associated with this number here, because of these brackets). This is one example, and this is its label here.

This is another example, and this is its label here. So clearly the pattern here is that for this list of zeros, the output we want is zero, and if there's a list of ones, the output we want is one. Very simple, but that's how the understanding starts: it starts from the bottom and then we work it up to more complex scenarios like designing a self-driving car.

So then we go here.

We want n_estimators. A random forest classifier is actually a combination of multiple different models. So we want 10 different models.

clf.fit, and then remember what I said before: fit is actually telling our machine learning model to find the patterns between this and this. That's what we're doing. That's

machine learning code right there.
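The snippet being read out appears to be the short random forest example from the scikit-learn documentation. A reconstruction, assuming scikit-learn is installed and assuming each input is a list of two numbers (the walkthrough only says "a list of zeros" and "a list of ones"), might look like this:

```python
from sklearn.ensemble import RandomForestClassifier

# Inputs: one list of zeros and one list of ones (two values each is an assumption).
X = [[0, 0], [1, 1]]

# Labels: the ideal output for each input, in the same order.
y = [0, 1]

# A random forest is a combination of several models; here, 10 decision trees.
clf = RandomForestClassifier(n_estimators=10)

# fit() asks the model to find the patterns between the inputs and the labels.
clf = clf.fit(X, y)

# Ask the trained model about an input it has seen before.
prediction = clf.predict([[0, 0]])
print(prediction)
```

With only two training examples this is a toy, and because each tree is trained on a random resample, the prediction can vary from run to run; the point is the shape of the workflow: import, define inputs and labels, instantiate, fit, predict.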

Right, how cool is that? So if you understand... we'll go through some learning pathways in a minute, but scikit-learn: chances are, if you're using machine learning, you're probably going to come across scikit-learn. It's actually a beautifully designed machine learning library.

That's just one little part of it. We'll come back here.

So we chose an algorithm. Now, remember what I said: choosing an algorithm actually depends on what problem you're trying to solve. So we've got a series of supervised learning algorithms. That means you have data and you have labels. You've got linear regression (I've got a link for each of these), logistic regression, k-nearest neighbors, support vector machines,

decision trees and random forests; we just saw one of those.

Now.

Just knowing this doesn't necessarily mean that you know how the random forest classifier works. Do you need to know how the model works under the hood?

Now, I would say, to begin with, don't bother learning how each of these works under the hood.

Instead,

practice applying each one of them

in an experimental fashion, and then dive in when your curiosity gets sparked on a certain type of one. Because think about this, that diagram we used before:

it's trying to put the cart before the horse.

See if these algorithms work,

or see if you can apply them in some way, and then when you've got a little bit of momentum going and you want to figure out, how do I improve them?

Or, how do I understand this more, so I know what's going on behind the scenes? Then start to dive deeper.

So then we've got neural networks. As you see here, if you think that machine learning is just the neural networks that you're seeing, or the artificial intelligence also called deep learning, it's actually a lot of different machine learning algorithms.

And with a little practice, you'll start to get better at assigning which one to use for certain problems.

It takes a lot of experimentation too. Neural networks can be used for a whole bunch of different things: classification, regression. A neural network takes a series of inputs and manipulates the inputs with linear functions (ooh, fancy word: dot products between weights and inputs) and nonlinear functions (activation functions).

What's that?

That's where you do some of your research. You search it like that:

what is an activation

function?

Activation function.

You'd be surprised. Again, Wikipedia, probably full of math. Look at that.

But as I said, start by reading. If you're curious about something, just read one post end to end.

And then start to dive deeper when you need to.
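As a small illustration of "dot products between weights and inputs" followed by a nonlinear activation function, here is a single artificial neuron written by hand. The choice of the sigmoid as the activation, and the particular weights and inputs, are assumptions made for this sketch; real networks stack many such units and learn the weights from data.

```python
import math


def dot(weights, inputs):
    """Linear part: multiply each input by its weight and add everything up."""
    return sum(w * x for w, x in zip(weights, inputs))


def sigmoid(z):
    """Nonlinear part: one possible activation function, squashing z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias):
    """A single neuron: a dot product plus a bias, passed through an activation function."""
    return sigmoid(dot(weights, inputs) + bias)


# Made-up numbers: 0.5 * 1.0 + (-0.25) * 2.0 + 0.0 = 0.0 before the activation.
output = neuron(inputs=[1.0, 2.0], weights=[0.5, -0.25], bias=0.0)
print(output)
```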

So we've got other algorithms such as gradient boosting machines (we've got links to those), and different types of neural networks: convolutional neural networks, typically used for computer vision, which is something like this.

This neural network has actually got a max pooling layer there. We're not diving too deep into what they are; this is a high-level overview. And then you've got other algorithms for unsupervised learning. Oh, actually, this is what we're missing: we don't have any reinforcement learning algorithms.

In practice,

I haven't come across using reinforcement learning algorithms outside of, like, a research domain. What I mean by that: the idea of research is, of course, to push forward the knowledge of the space, but it's not always necessary to put something into a practical use case,

such as in Tesla's case: their whole game is machine learning on a scale that they can use in their self-driving cars.

But anyway, we have unsupervised algorithms, such as clustering, visualization and dimensionality reduction, and anomaly detection.

So once you've chosen an algorithm...

This is a little bit of a tidbit: every learning algorithm has a loss function, an optimization criterion and an optimization routine.

Once you've chosen an algorithm: type of learning.

How does the algorithm actually learn on your dataset? We've got batch learning. In other words, all of your data exists in a big static warehouse and you're training a model on it.

You've got online learning: your data is constantly being updated and you're constantly training new models on it. Each learning step is usually fast and cheap, as opposed to batch learning, and it runs in production and learns continuously.

So, let's think about how Tesla might use batch learning and online learning.

Now for online learning, they might have a scenario where they need to quickly update a certain thing in the cars, so they retrain a smaller part of their model and deploy that. What I mean by that, in our stop sign example: they might have found that their cars are performing really badly on stop signs like this.

We need to upgrade our dataset so that our machine learning model understands that this is actually a stop sign, and this is actually a stop sign, and this is actually a stop sign.

So they might want to use online learning for that type of scenario. Whereas for batch learning, maybe they've got a new style of car in a prototype and, instead of the eight cameras, it's got 25 cameras. So they may take their entire dataset of all the information from the eight-camera cars, combine it with their

25-camera cars, and train a completely new machine learning model in one big hit on all of the data that they have.

Now again, I'm making that example up, but that's just how you can imagine these things. Batch learning: typically everything happens in one big go. Online learning: little by little, in a constantly changing environment.
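As a rough illustration of "one big go" versus "little by little", here is a toy sketch in plain Python. It is not how Tesla or any library does it: the "model" is just an average, and `batch_mean` and `OnlineMean` are hypothetical names. The point is only the shape of the two update styles.

```python
def batch_mean(data):
    """Batch learning: look at the whole static dataset in one go."""
    return sum(data) / len(data)


class OnlineMean:
    """Online learning: update the estimate one example at a time."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, x):
        # Each step is fast and cheap: no need to revisit old data.
        self.count += 1
        self.mean += (x - self.mean) / self.count
        return self.mean


readings = [2.0, 4.0, 6.0]

online = OnlineMean()
for reading in readings:  # data arrives little by little
    online.update(reading)

print(batch_mean(readings), online.mean)
```

Both approaches land on the same answer here; the difference is that the online version never needs the whole dataset in memory and can keep updating as new data streams in.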

Transfer learning. This is so important. I want you to research this: transfer learning.

Take the knowledge of one model, what it's learned, and use it with your own.

Transfer learning gives you the ability to leverage SOTA. By the way, you're going to see the acronym SOTA a lot in machine learning research. It just means state of the art: the results your machine learning model is getting are state of the art on a certain problem.

This is helpful if you don't have much data or vast compute resources. So use the following resources for different transfer learning models: TensorFlow Hub, PyTorch Hub, Hugging Face Transformers, Detectron2, to name a few. Active learning: that's where you get a machine learning model to figure out some things on its own, and you also correct it with a

human in the loop. So for example, in a Tesla car scenario, our human in the loop may be finding the scenarios where the car doesn't perform very well,

collecting more data on that, and then putting it back into the machine learning model.

Just like that stop sign we talked about. Ensembling: not really a form of learning, it's more combining different algorithms together. A random forest is an example of an ensemble machine learning algorithm. That kind of means you're leveraging the wisdom of the crowd. So if you asked one person,

"Hey, what's the best direction to take, left or right?"

they might say right, but then if you ask nine other people, they might say left. So should you trust the nine people or should you trust the one person?
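The "wisdom of the crowd" idea can be sketched as a simple majority vote. This is a minimal, assumed illustration (the function name `majority_vote` is made up here), not the actual random forest algorithm, which also trains each tree on different samples of the data.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the answer that the most voters (models) agreed on."""
    return Counter(predictions).most_common(1)[0][0]


# One person says "right", nine other people say "left".
answers = ["right"] + ["left"] * 9
print(majority_vote(answers))
```

In an ensemble, each "person" is a separate model, and the combined answer is usually more reliable than any single model's.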

Now.

If we come here, we've got underfitting. Underfitting happens when your model doesn't perform as well as you'd like on your data.

So that means that basically the model hasn't learned as much as your evaluation metric would like it to. And remember, if we come back up here,

go to our problems.

We've got a series of evaluation metrics. Say we wanted to train a classification model to be 99.999% accurate.

<unk>.

Our model is actually underfitting.

Underfitting.

And it's getting only 92% accuracy.

In the case

of a Tesla car, you need almost 100% accuracy for detecting pedestrians, because if it fails to detect a pedestrian, well, that's not going to be very good for that pedestrian, is it?

That's an example of underfitting. Overfitting

happens when your validation loss (how your model is performing on the validation dataset; lower is better) starts to increase.

Or, if you don't have a validation set, it happens when the model performs far better on the training set than on the test set. Overfitting is like, say, when you're studying for an exam

and you've learned the course materials too well, and then it comes time for the final exam and all you know is how to reproduce the course materials. So your skill set can't generalize (that's a machine learning term there as well: generalize) to a new set of problems. So in machine learning,

it typically happens when your evaluation metric is far higher on the training set,

in other words the course materials, than on the test set. The test set is meant to mimic how your model might perform in a deployed scenario.

You can fix this through the various regularization techniques.

We go here.

L1 (lasso) and L2 (ridge) regularization, and dropout. Dropout is actually really convenient: it basically says, hey, let's remove just random parts of our model so that the rest of it becomes better.

[laughter].

Pretty cool. Early stopping: stop your model training before the validation loss starts to increase too much. Data augmentation, batch normalization. Again, these things, I'm going to be saying them to you, but I'm kind of just tying them together with what we're doing here. As I said before, we're not really diving into them. If you're looking at

this and going, "Holy crap, there is a lot here": there is.

What's important is that as you start to get practice working with different machine learning problems, you will start to go: okay, I know where dropout fits in now, I know where data augmentation fits in, I know where batch normalization fits in.

All of this comes with practice, what we're doing now is just tying things together.
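Of the techniques just listed, early stopping is the easiest to sketch. The following is a simplified illustration in plain Python, assuming you already have a list of validation losses, one per epoch. The function name and the `patience` parameter are assumptions for this sketch; real libraries offer their own callbacks with more options.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return (stop_epoch, best_epoch) for a list of per-epoch validation losses.

    Training "stops" once the loss has failed to improve for `patience`
    epochs in a row. Lower loss is better.
    """
    best_loss = float("inf")
    best_epoch = 0
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    # Never triggered: training ran through every epoch.
    return len(val_losses) - 1, best_epoch


# Validation loss falls, then starts to rise again (a sign of overfitting).
losses = [0.9, 0.7, 0.6, 0.65, 0.7, 0.8]
print(early_stopping_epoch(losses))
```

In practice you would keep the model weights from the best epoch rather than the ones from the epoch where training stopped.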

And finally, once you've trained a model, you might do some hyperparameter tuning: run a bunch of experiments with different model settings and see which works best. So we've got two

components with your machine learning models. Remember when I said we have the cooking experiment, or the cooking example? To get the perfect roast chicken dish,

we might have to preheat the oven to 200; that would be one setting. We might also have to turn it on fan-forced. So those would be two settings that we have to choose. With a machine learning model you typically have hyperparameters such as the learning rate, which is often the most important hyperparameter. There are different ways that you can set this setting, such as learning rate scheduling

or the cyclic learning rate. There are also different things such as

batch size, momentum, weight decay, number of layers.

We've already said batch size. Number of trees, number of iterations, and many more. A tip for

tuning hyperparameters:

if you're wondering how to tune hyperparameters, try just searching for the algorithm name, such as "random forest hyperparameter tuning".
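"Run a bunch of experiments with different settings and see which works best" can be sketched as a grid search. This is a toy illustration: `toy_score` is a made-up stand-in for "train a model with these settings and return its validation score", and the grid values are arbitrary.

```python
from itertools import product


def toy_score(learning_rate, num_layers):
    """Hypothetical stand-in for training a model and measuring validation accuracy."""
    return 1.0 - abs(learning_rate - 0.01) * 10 - abs(num_layers - 9) * 0.01


def grid_search(grid, score_fn):
    """Try every combination of settings and keep the best-scoring one."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


grid = {"learning_rate": [0.1, 0.01, 0.001], "num_layers": [6, 9, 12]}
best_params, best_score = grid_search(grid, toy_score)
print(best_params, best_score)
```

Grid search gets expensive quickly, because every extra setting multiplies the number of experiments, which is one reason people reach for random search or automated tuning tools instead.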

So if we look at this paper, "A disciplined approach to neural network hyper-parameters".

Let's click this one.

So this is the kind of thing, the resources that you'll come across. This is arXiv, by the way, where a lot of machine learning research gets published. When you first interact with this you're going to be like, whoa,

a lot of complex stuff going on here. But again, with practice you'll start to see the value in it, you'll start to understand it more, you'll start to see how you can use it in your own work. This is taking a little while to load, isn't it?

A hyperparameter may be... so this is a machine learning model. Let's say each of these is a model in itself, each of these little boxes.

These are often referred to as layers in a machine learning model.

Now if we wanted to improve this model, what we might do is actually add another three of these boxes over here.

So then we've got a total of, what's this, 1, 2, 3, 4, 5, 6, 7, 8, 9. Maybe we add another three boxes over here to find more patterns in the data, and then we've got 12 boxes. So now our machine learning model is actually comprised of 12 smaller models.

Remember ensemble learning? Now, using "ensemble learning" here is actually the wrong way to describe it, but

in terms of improving our model, one way might be to increase the number of layers here, so increase the complexity of the model. Another way might be to decrease these, so to cut this off here

and we

end up with six layers. So that is a form of hyperparameter tuning.

Any setting that you change on your model by hand is a form of hyperparameter tuning.

So we come here. This is what a research paper looks like.

Pretty cool, hey.

So, "A disciplined approach to neural network hyper-parameters". If you read through this:

"Although deep learning has produced dazzling successes for applications of image, speech, and video processing..." What does it say here? Most training is

with suboptimal hyperparameters, requiring unnecessarily long training times. So in the case of cooking our chicken, an unoptimized hyperparameter setting might be an oven on 100 degrees. That's going to take far too long for our chicken to cook. So instead of the ideal setting, a machine learning algorithm might have by default

100 degrees as its setting. We might actually need 200. So that's what hyperparameter tuning is all about: finding the right settings for your model.

Now, I will give you a little tidbit as well: usually with a machine learning model, hyperparameter tuning doesn't play as much of a part as data collection and data preparation.

Those two alone, if you have good data collection and good data prep, will generally result in a good machine learning model being trained.

Analysis/evaluation. We've gone through a few of these evaluation metrics, but it's not only the metric performance: how long did training take, how long does inference take, and what are the costs involved with that? For example, with the Tesla example.

We've already had that example twice, but in Tesla's case, in that Autonomy Day talk,

this one here, they said that their model takes 70,000 GPU hours to train. Now let's just have a look at this GPU pricing.

If we were to rent a GPU (now, a GPU is a graphical processing unit, which is very fast at finding patterns in numbers, which is what a machine learning model does).

If we were to rent a GPU, one GPU is...

There we go, in US dollars. A Tesla T4, which is kind of an entry-level GPU. Why don't we use a V100? That's actually a really fast GPU.

Let's say

$1.46. So if we did 1.46 times... if we want to rent that GPU for 70,000 hours, that's going to cost us $102,200

to train one machine learning model. Now of course, you're probably not training machine learning models as big as Tesla's yet, but these are things you're going to want to take into consideration: not just how well the model performs. Say the model performs at 99.9% accuracy but costs $1 million to train,

and you've only got a budget of $2 million. Well, you're not going to be able to train very many models.
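The arithmetic above is simple enough to check in a couple of lines. The prices are the ones quoted in the video at the time of recording and will have changed; the function names are made up for this sketch.

```python
def training_cost(gpu_hours, price_per_gpu_hour):
    """Total rental cost in dollars, rounded to the nearest cent."""
    return round(gpu_hours * price_per_gpu_hour, 2)


def models_within_budget(budget, cost_per_model):
    """How many full training runs the budget covers."""
    return int(budget // cost_per_model)


# 70,000 GPU hours at $1.46 per hour, as in the example above.
print(training_cost(70_000, 1.46))

# A $2 million budget with a model that costs $1 million to train.
print(models_within_budget(2_000_000, 1_000_000))
```

The first figure works out to $102,200, matching the number worked out above, and the second shows why a $1 million training run leaves room for only two attempts.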

And inference is how long this model takes to make a prediction.

So if our car is on the road,

and our machine learning model could be 99.9999% accurate,

but it takes 10 seconds to make a prediction, do you think that's going to help?

Imagine the car is driving on the road and it sees that person coming up and says, "All right, I'm just making a prediction. I'm not sure what it is, give me 10 seconds, but I can assure you that prediction will be very accurate."

Probably not going to fly.
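Measuring inference time is something you can do with the standard library. Here is a minimal sketch, where `toy_predict` is a hypothetical stand-in for a real model and the 0.1 second budget is an arbitrary example, not a real requirement from any car maker.

```python
import time


def timed_prediction(predict_fn, x):
    """Run one prediction and return (prediction, seconds_taken)."""
    start = time.perf_counter()
    prediction = predict_fn(x)
    return prediction, time.perf_counter() - start


def meets_latency_budget(latency_seconds, budget_seconds):
    """True if the prediction was fast enough for the use case."""
    return latency_seconds <= budget_seconds


def toy_predict(x):
    """Hypothetical stand-in for a real model's predict function."""
    return "pedestrian" if x > 0.5 else "clear"


prediction, latency = timed_prediction(toy_predict, 0.9)
print(prediction, meets_latency_budget(latency, budget_seconds=0.1))

# The 10-second model from the example above would fail the same check.
print(meets_latency_budget(10.0, budget_seconds=0.1))
```

When you do this for real, time many predictions and look at the slow ones too, not just the average.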

So that's where you have to take training and inference cost into account. The What-If Tool: one of the big things about machine learning is that it's actually very hard to explain what your machine learning model is doing. The What-If Tool helps you with that.

"Visually probe the behavior of trained machine learning models, with minimal coding."

So I'll let you check that out. That's a big thing about machine learning: explainability. Least confident examples: what does the model get wrong?

The bias and variance trade-off: high bias results in underfitting (remember, we talked about underfitting before)

and a lack of generalization to new samples. High variance results in overfitting due to the model finding patterns in the data which are actually just random noise.

Machine learning models are actually really good at finding patterns in numbers, but sometimes they are so good they can find patterns in what is just random noise in the data.

So that's some analysis/evaluation.

Now finally, we've got serving a model and deploying a model.

After you've been through all of this (you have collected some data, you have prepared it, you've trained a model, you've done some analysis and evaluation), you want to get it into the world. So if you're Tesla: you collected data, you prepped it, you trained a model, you did some analysis and evaluation, you've even got some new samples here, that was very good, and you've

retrained the model. Now you want to serve it.

So how do you get a model, like in Tesla's case, how do you get it to cars? But in your case, you might want to serve it through some sort of API, a web application, or an iPhone application.

Serving a model is usually referred to as deploying a model. So let's put the model into production and see how it goes.

Now.

You may have the best evaluation metrics in the world, i.e. in a Jupyter notebook your model gets 99.9% accuracy on whatever problem it's working on.

But when you put it into an application,

everything starts to change, because you're going to get new data sources and people are going to use your model differently. That's the real test of a machine learning model. Now, tools that you can use to do this. Again, this space is rapidly evolving, so chances are if you are watching this video in a few months these might be outdated, but right now TensorFlow Serving is going to be really good, PyTorch Serving,

Google's AI Platform (you can make your model available as a REST API), and SageMaker, which is Amazon Web Services'

machine learning deployment tool. And then you've got MLOps: if you've heard of DevOps in software engineering, MLOps is DevOps for machine learning. So it's basically all of the technology required around the machine learning model.

So all of this stuff here is basically Tesla's data engine. Collecting the data from the car, that would be one part of the operation; uploading it to a data source, that would be another part of the operation; modeling it, that would be another part of the operation; this would be another part of the operation, testing it; collecting more data, et cetera, et cetera,

all in a loop.

How do you do that? Well, there's a great blog post here by Chip Huyen. I'll let you read through that one, especially this part here, this link here.

Really good.

"A Guide to Production Level Deep Learning". As you'll see, the modeling code is

actually not as big a part as some of the other things here. So make sure you check that out.

And then finally, retraining a model. Once your model is deployed,

if you find that your model's predictions start to age (not like a fine, fine wine)

or drift, that means that the data sources that your model is making predictions on have usually changed. So they might have new hardware. In the case of a Tesla car, for example: what if the camera has gotten an upgrade, and suddenly your machine learning model that was trained on lower-resolution photos doesn't work as well on high-resolution photos from the new <unk>.

<unk>.

So that's when you'll want to retrain your model.
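One very rough way to notice that incoming data has changed is to compare a simple statistic of the live data against the training data. The sketch below is an assumed illustration only: the feature (image width in pixels), the numbers, and the threshold of 3 standard deviations are all made up, and production drift monitoring typically uses more careful statistical tests.

```python
from statistics import mean, stdev


def mean_shift_score(train_values, live_values):
    """How far the live mean has moved, in units of the training standard deviation."""
    return abs(mean(live_values) - mean(train_values)) / stdev(train_values)


def has_drifted(train_values, live_values, threshold=3.0):
    """Flag drift when the live mean is more than `threshold` deviations away."""
    return mean_shift_score(train_values, live_values) > threshold


# Hypothetical feature: image width in pixels seen during training...
train_widths = [640, 642, 638, 641, 639]
# ...versus images coming from an upgraded, higher-resolution camera.
new_camera_widths = [1280, 1278, 1282]

print(has_drifted(train_widths, new_camera_widths))
print(has_drifted(train_widths, [640, 641, 639]))
```

A flag like this does not retrain anything by itself; it is just a signal that it may be time to collect fresh data and go back through the loop.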

Far out, that was a big dog one.

Machine learning process: the steps in a machine learning project. Collect some data, prepare the data, train a model on your data, analyze and evaluate it, and then retrain your model. If we come back, that's an example of how Tesla would do it.

Now, I actually did this in one of my own projects about a month or so ago. I collected some data

from Open Images, which is a data source.

Processed it all... actually, sorry, I used a Python script, downloaded it, and stored it in Google Storage.

I prepared the data with a Python script.

Trained a model using Detectron2, with transfer learning by the way, and to analyze and evaluate my model I used Weights & Biases. Now if you are looking at all these things and going, "Whoa, Daniel, we haven't covered any tools yet", don't worry, we're going to get to tools.

I analyzed and checked my experiments with the Weights & Biases dashboard.

Made a user interface with Streamlit, which is a

beautiful tool. I wrapped all of this in a Docker container, pushed the Docker container to GCR, and deployed my app with App Engine. So this would be my machine learning ops, machine learning operations, for replicating Airbnb's amenity detection with Detectron2. Now, retraining a model: I didn't actually.

But if I was I would find the worst performing classes. So this problem was using computer vision to detect amenities in photos.

Amenity detection.

This is what happened this is the problem in a nutshell.

You're Airbnb, and you want to find out what amenities (a.k.a. an oven or a kettle or a sink or a bathtub or an air conditioner) are in your listings, and so one way to do it is to use computer vision

and analyze each of the photos as someone uploads them to your platform,

scan for different things in there, and then add them to your list of amenities. So if we come back to the presentation, that's what I did; this was my process. If I wanted to retrain a model, I would find the worst-performing classes. Say my model sucked at finding

chairs: I'd get more photos of chairs, feed that back into my Google Storage,

and retrain the model, and then go back through this process. If you want to see how this was done, there is a link there to the Airbnb playlist.
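Finding the worst-performing classes comes down to computing accuracy per class and sorting. Here is a minimal sketch with made-up labels and predictions. It treats the task as plain classification for simplicity; for an object detection model you would normally look at a per-class detection metric instead.

```python
from collections import defaultdict


def per_class_accuracy(y_true, y_pred):
    """Accuracy for each class, computed over the examples of that class."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, prediction in zip(y_true, y_pred):
        total[truth] += 1
        if truth == prediction:
            correct[truth] += 1
    return {label: correct[label] / total[label] for label in total}


def worst_classes(y_true, y_pred, k=1):
    """The k classes with the lowest accuracy: candidates for more data."""
    accuracy = per_class_accuracy(y_true, y_pred)
    return sorted(accuracy, key=accuracy.get)[:k]


# Made-up evaluation results for illustration.
y_true = ["chair", "chair", "oven", "sink", "chair", "oven"]
y_pred = ["sink", "chair", "oven", "sink", "oven", "oven"]

print(per_class_accuracy(y_true, y_pred))
print(worst_classes(y_true, y_pred))
```

In this made-up example the model gets chairs wrong most often, so chairs are the class to collect more photos of before retraining.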

Well I think.

We might be done.

With machine learning process as I said before.

This is not an exhaustive list, although we did cover a whole bunch there. In this video I don't actually have sections this long going forward.

There's probably a lot of stuff I've missed out. So if there was, let me know. But as I will keep saying until I go hoarse: go through this, explore it yourself if you want to find out anything more.

Have a look at these cool little graphics and see how you can tie it in with.

The information here.

So with that being said what are we up to.

So we're up to number three: machine learning tools. Have a little chill out, I'll be back in a second, and we'll go through some machine learning tools that we can use to get the job done.

Alrighty, machine learning tools, let's check it out.

What have we got here? Machine learning tools: tools you can use to get the job done. It might have been a better idea to go in a nice little circle and through mathematics, or whatever is next, but we're already here, boom. So I've broken these up into some libraries. So there is a Python flavor.

Zoom in a little bit there.

Libraries, Python-flavored. Now, I say Python-flavored, but you can actually write machine learning code in any programming language

that allows numerical computing. A lot of these libraries here, like scikit-learn, PyTorch, TensorFlow, and Python itself, actually execute C code under the hood, so really fast code. But when you're first getting started, chances are you're going to be interacting with one or more of these libraries. I'd actually argue,

like, if you're getting into deep learning, you probably should know some PyTorch and TensorFlow. I mean, they have a lot of overlap. But we're getting too deep at the moment; let's get back to the little bit of a presentation.

And we have another very colorful diagram that has been broken out into some sections. We've got libraries/code space, experiment tracking, pretrained models, data and model tracking, cloud compute services, hardware (in case you wanted to build your own deep learning machine rather than using a cloud compute service),

AutoML and hyperparameter tuning, explainability (in case you wanted to get insights: why did my model do that certain thing?), machine learning lifecycle,

so MLOps, and user interface design. At the moment Streamlit is really in a class of its own there; it's pretty good.

Now if we go through these (I'm not going to go through each one of these), this is probably the main stack you'll be using to write machine learning code: Python, Jupyter notebooks, TensorFlow. If you want to do TensorFlow for the web, you'd use TensorFlow.js. If you wanted to do TensorFlow on a mobile device or a small computer like a Raspberry Pi, you'd probably use TensorFlow Lite.

PyTorch,

which is very similar to TensorFlow. PyTorch Lightning is

like a wrapper.

ONNX is Open Neural Network Exchange, which basically converts your TensorFlow or PyTorch model into

another type of machine learning model that can be run across many different types of hardware. If you're thinking, "Well, I'm looking at all of this and there's a lot going on here, I don't even know where to start": there is a lot to go through, so don't worry if everything seems overwhelming. Rather than trying to learn all of these things in one hit, it's best to start working on your own problems and figure

out where these pieces of the puzzle fit in. So if you are writing code in TensorFlow and you need to track your experiments, you might use Dashboard by Weights & Biases. And if you want a pretrained TensorFlow model, you might use TensorFlow Hub. And if you wanted to track the changes you're making to your data, you might use Weights & Biases Artifacts. And if

you want a cloud compute service, well, Google Colab is a free resource, but if you need a bigger amount of compute you might use Google Cloud or AWS or Microsoft Azure, and that is if you don't even have your own computing resources. Now, truth be told, as you're first getting started you're probably only going to be using Colab to begin with, but then

it's very helpful (I'll recommend this in the learning resources) to learn at least one cloud provider.

If you're getting really serious, you might want to build your own deep learning computer, which is basically just a normal computer with a really big dog GPU.

We're not going to dive too much into that. I've got some links for those.

AutoML and hyperparameter tuning: if you wanted to improve the settings on your model, or if you want to build an automatically generated machine learning model, there are tools for that out there. It's very rare these days, actually, unless you're a hardcore researcher, to build your own machine learning model from scratch. You would more so use transfer learning,

or AutoML if you just want a proof of concept. When I was working as a machine learning engineer consulting to other businesses, we would often build a small proof of concept using AutoML if they had any data, because it was just easier to do that: get it off the ground, get something working, show them how it worked, and then build out

custom features when required.

Machine learning lifecycle: you have tools like Kubeflow, Seldon, MLflow. Truth be told, this is still really being worked out; this is an evolving process. Unless you're a giant company who has their own custom machine learning lifecycle, these are open source tools here.

I mean, I think <unk> has now gone into a paid offering, but again, we're getting ahead of ourselves here. If you're just starting out, you probably won't touch these tools until you're at least a year in and at least building your own products or machine-learning-powered applications. And then if you want to build some awesome proofs of concept, use Streamlit. I'm just going to say that. A web dashboard

that someone can use on their own computer: use Streamlit, deploy it with one of these cloud services, and

that was pretty oversimplified, but that's the main gist of it all.

So let's go back to our little flow chart here.

Our mind map around machine learning campus is what we've been calling it heavily so toolbox.

Experiment tracking.

Now we've got TensorBoard. We've got a link here; let's have a look at what this looks like.

Basically, as I said, machine learning is going to be training a whole bunch of different models and seeing which one works best. TensorFlow's TensorBoard helps you to figure that out.

Now if we go here you can also use dashboard from weights and biases.

Dashboard experiment tracking for machine learning models.

A system of record.

Does that make sense to you? Anyway,

for your model results. So you can imagine here you've got 210 runs.

We've got all of these different options here, and maybe these are all of your different results there. So you might go: okay, well, this model up there, that little one there, got the best accuracy, so I'm going to click on that one and see what type of parameters got that best accuracy.

Neptune.ai is a similar version of this. I've personally had hands-on experience with Weights & Biases; I actually love all of their offerings. So this is a big shout-out to Weights & Biases. I absolutely love what they're making, so make sure you check that out. It is becoming basically a must-have in my machine learning workflow.

And it's not sponsored by Weights & Biases; I just really love the products they're building.

Pretrained models. So for transfer learning you might want to go to TensorFlow Hub.

TensorFlow Hub is a library for reusable machine learning modules. As I said, it's getting rarer and rarer these days to build a completely new machine learning model from scratch. This is how easy

it actually is to incorporate: like this one line of code, or these two lines of code,

to incorporate a pretrained TensorFlow model from tfhub.dev.

So you can actually come here.

And your problem domain: image classification.

Okay image.

Image modules.

How cool is that.

And then the same thing is with PyTorch Hub. Now, I believe PyTorch Hub is not as well refined as TensorFlow Hub, but it's still got some pretrained PyTorch models; great offering there. Hugging Face Transformers: now, a transformer is a natural language processing architecture.

Now, Hugging Face is a natural language processing research team.

And they have the biggest open source repository of transformers.

So there you go, you can read about all of them.

Read about what it's got here.

Long story short, if you're working on text problems, you're probably going to want to look into using transformers of some sort.

Detectron2 is Facebook's open source software

from Facebook AI Research for state-of-the-art object detection algorithms. So a lot of computer vision gets done with Detectron2; I've had hands-on experience with that. I am probably missing some things here, so if you have any more resources for transfer learning type models, please let me know and I'll put them in here. But these are some of the main ones; this will take care of

Almost 80% of your problems if you check in here.

Now.

It's one thing to track modeling experiments: what changes have you made to your model, and how does that affect performance? It's another thing to start tracking data. So remember, in the machine learning process (we expand this big dog here),

we had data preparation: preparing your data for a model to be trained on. Now, when you collect data, chances are when you prepare it you're going to make some changes to it.

Now,

you might want to keep track of the changes that you've made to your dataset. That just makes sense, right? Like, if you've got the original dataset that you've collected and then you make a change to it, you've got change one, and the model's performance on that, you've got model one. You make change two, but then you rerun model one on change two. All of a sudden, you

do that a couple of times and you're going to get very mixed up. But Artifacts by Weights & Biases (remember how I said I love Weights & Biases' offerings?)

now gives you a way to version your dataset: track the changes that you've made to the data that you're using, and then build reproducible machine learning models on those different datasets.

Awesome awesome tool.
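To get a feel for why versioning data helps, here is a toy sketch that fingerprints a dataset with a hash, so "change one" and "change two" get different version identifiers. This is only the underlying idea under simplifying assumptions; it is not how Weights & Biases Artifacts or any other tool works internally, and `dataset_version` and `experiment_log` are names made up for this sketch.

```python
import hashlib
import json


def dataset_version(rows):
    """A short fingerprint that changes whenever the data changes."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


original = [{"image": "a.jpg", "label": "chair"}, {"image": "b.jpg", "label": "oven"}]
change_one = original + [{"image": "c.jpg", "label": "chair"}]

# Record which data version each model was trained on, so runs stay reproducible.
experiment_log = {
    "model_1": dataset_version(original),
    "model_2": dataset_version(change_one),
}
print(experiment_log)
```

With a record like this you can always answer "which exact data did this model see?", which is the mix-up described above. Dedicated tools add storage, lineage, and collaboration on top of that idea.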

And then you've got DVC, which is, I think it's open source actually.

Yeah, here we go: "Open-source Version Control System for Machine Learning Projects". So this is really where machine learning is hitting the pointy part, where it's merging with software engineering. Remember at the start of this presentation we talked about software 1.0, and machine learning being software 2.0? Well, they're kind of merging into being

one thing.

But Data Version Control: check that out if you need to track your datasets and models.

Now, cloud compute services. You might be wondering, why do I need a cloud compute service? Well, remember our overarching definition of what machine learning is: it's finding patterns in numbers and then doing something with those patterns. But finding those patterns in numbers is fairly compute intensive, which means it requires a lot of computing power.

But the good thing is GPUs. A GPU is a graphical processing unit. What's the definition of a GPU?

Graphics processing unit.

Here we go.

"A specialized electronic circuit designed to rapidly manipulate and alter memory to accelerate the creation of images in a frame buffer..." Well, that's our formal definition there.

What I want you to think of a GPU as is just really fast at processing numbers. Now, if you don't have a GPU on your laptop... my MacBook Pro, which I'm actually using to record this, has a GPU.

It's not an Nvidia GPU.

And Nvidia kind of have, how do you say, a monopoly on machine learning GPUs.

That's my MacBook Pro's GPU.

If I wanted to use TensorFlow or <unk>, one of these machine learning libraries, it doesn't work

very well with my MacBook Pro's GPU without a lot of hacking, so I need an Nvidia GPU.

You might be wondering, how do I get one of those? Well, Google Colab provides one for free.

If we go to Colab.

I'm just going to show you the dog vision project that we went through.

Earlier.

If I wanted to.

set up a GPU here, I'd go Runtime, Change runtime type,

Hardware accelerator. I can even use a TPU, which is a tensor processing unit. A tensor is just like a neat package of numbers. So it's like rows, columns, and also dimensions that go back, like the z dimension, and then it can actually have any number of dimensions. So a computer is very good at working with or understanding things

that have more than three dimensions, whereas we humans aren't very good at that.
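Here is a small sketch of the "neat package of numbers" idea using nested Python lists. The `tensor_shape` helper is made up for this sketch and assumes the nesting is regular (every sub-list at a given depth has the same length); real libraries such as TensorFlow and PyTorch have their own tensor types that track shape for you.

```python
def tensor_shape(tensor):
    """Walk down nested lists and record the size of each dimension."""
    shape = []
    while isinstance(tensor, list) and tensor:
        shape.append(len(tensor))
        tensor = tensor[0]
    return tuple(shape)


# A tiny "image": 2 rows, 4 columns, 3 colour channels going "back".
image = [[[0, 0, 0] for _ in range(4)] for _ in range(2)]

# A batch of 5 such images adds a fourth dimension.
batch = [image for _ in range(5)]

print(tensor_shape(image))
print(tensor_shape(batch))
```

Three dimensions is already hard to picture and four is harder, but to the computer it is just one more level of nesting.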

Now Colab, because it's free, has some limitations. If that doesn't suffice for your needs, you may want to look into a

cloud compute service such as AWS. Their machine learning offering is SageMaker.

"Machine learning for every developer and data scientist."

AI Platform on GCP, which is Google Cloud Platform (it was formerly called Machine Learning Engine), or Azure Machine Learning from Microsoft Azure. Now, okay, they're going to have different names and different fancy marketing terms, but they're basically different versions of the same thing, depending on where you work or

what you're on. Like personally, I just like Google Cloud Platform because that's the one I've got most experience with. I've used AWS in the past and worked with companies that have used Microsoft Azure.

It really just depends, but basically these three big dogs kind of own the cloud computing game. There's probably a few more out there that I haven't listed. If you do have one you'd like me to recommend... I think there is

FloydHub.

Another one.

Paperspace, that's another one. Paperspace.

FloydHub: deep learning platform.

Paperspace.

Yes.

You could use this if you want.

I'm just putting these there because they are the most common.

Hardware. If you wanted to build your own deep learning PC, that means maybe you're building a machine learning startup, or you don't want to pay for cloud services every month.

These are the articles you should read.

You're basically going to need a GPU.

So I read this post by Tim Dettmers, and this article by Jeff Chen, which is arguing why you should actually build your own deep learning PC.

I think the math comes out toward this: if you're using cloud compute for a long time, you want to build your own deep learning PC.

That's the sort of route I'm taking. I'm currently in the process of building my own deep learning PC.
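The "math comes out" argument is a break-even calculation, which can be sketched as below. The function name and every number here are hypothetical placeholders, not figures from either article, and the sketch ignores electricity, resale value and hardware upgrades.

```python
def break_even_months(pc_cost, cloud_rate_per_hour, hours_per_month):
    """Months of cloud usage after which a one-off PC purchase becomes cheaper."""
    monthly_cloud_cost = cloud_rate_per_hour * hours_per_month
    if monthly_cloud_cost <= 0:
        raise ValueError("monthly cloud cost must be positive")
    return pc_cost / monthly_cloud_cost

# Hypothetical numbers for illustration only, not real prices.
months = break_even_months(pc_cost=3000.0,
                           cloud_rate_per_hour=1.0,
                           hours_per_month=120)
print(f"Break-even after about {months:.1f} months")
```

Plug in your own hardware quote and cloud bill; the longer and heavier your usage, the sooner owning the machine pays off.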

Now, another tool in your toolbox is AutoML.

Ooh, auto machine learning.

So remember how I said building a custom machine learning model by hand is becoming less and less common? And remember how I said, right back at the start, that machine learning is like figuring out or calculating the rules, figuring out the patterns in your dataset? Why not use machine learning to build itself?

That's the premise of AutoML. You can also use it for hyperparameter tuning. So just like we would be tuning the dials on our oven while cooking our Sicilian grandmother's beautiful roast chicken dish, we could actually use machine learning to figure out what the best set of hyperparameters is, the best settings on the oven. So we could just end-to-end machine learn the whole thing. Now there are a few different libraries that you can use.
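As a rough sketch of "tuning the dials" automatically, here is a plain grid search with scikit-learn's `GridSearchCV`. This is not one of the AutoML libraries mentioned in this video; the dataset, model and grid values are assumptions picked only to keep the example small.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Small built-in dataset, used here only as a stand-in for your own data.
X, y = load_iris(return_X_y=True)

# The "dials on the oven": candidate hyperparameter settings to try.
param_grid = {
    "n_estimators": [10, 50],
    "max_depth": [2, 4, None],
}

search = GridSearchCV(
    estimator=RandomForestClassifier(random_state=42),
    param_grid=param_grid,
    cv=3,  # 3-fold cross-validation for each combination
)
search.fit(X, y)

print("Best settings:", search.best_params_)
print("Best cross-validated accuracy:", round(search.best_score_, 3))
```

A grid search tries every combination, so it gets expensive fast; the AutoML tools below search the space more cleverly and also search over whole pipelines.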

For scikit-learn style machine learning algorithms you want to have a look at TPOT. We kind of touched on this before.

It's a really cool piece of software. What it does, there's a nice graphic here, here we go: TPOT will automate the most tedious part of machine learning by intelligently exploring thousands of possible pipelines to find the best one for your data. How good is that? That's a great sales pitch.

Automated by TPOT.

So this is what you're going to have to do: you're going to have to collect the data, clean it up, and get it ready for a machine learning model.

TPOT will do all of these steps, such as feature preprocessing, selection and construction, and model selection, and it will also do parameter optimization. Then once it does all that for you, you're going to have to validate it. So TPOT takes care of all of that. That's pretty cool.

Google Cloud AutoML.

So you can come up here if you don't have much machine learning expertise or don't know how to write code.

You can just plug your data into Google Cloud. You've got AutoML Vision, Video Intelligence, Natural Language, Translation, even machine learning on tables, automated on structured data.

The downside is that you need to run API calls to Google for inference. So if you wanted to deploy your model and have it run locally on a small device, that's probably not a great option, because you don't get to keep the machine learning model if you train it on AutoML. That's kind of a common theme with a lot of AutoML services.

Maybe that will change in the future. Microsoft have an offering. Weights & Biases again have another offering, Sweeps: scalable, customizable hyperparameter tuning.

And then just go to Weights & Biases and check out their whole offering.

Products. Amazing resources if you want to learn machine learning, check it out. The documentation could do with a little bit of improving, but we're not going to complain. I mean, if you build so many tools and stuff, you're going to get popular.

Finally, there is Keras Tuner. There are probably a few more options here, but these are just some of the main ones. Keras is another deep learning library or framework that is now part of TensorFlow.

So it's quite confusing, but if you go to keras.io, you can use Keras on its own.

Simple. Flexible. Powerful. It sounds like me.

Set them simple or complex being.

You can run it on TensorFlow too, so it's built on top of TensorFlow 2.0. Deploy anywhere.

TensorFlow and Keras had a love child.

And if we go back, where is it, get started. Now again, the introduction to the Keras Tuner is actually in the TensorFlow documentation. So this is just going to help you tune your Keras model.

Got some stuff here.

We don't have a great.

That's right.

Now.

Once you've trained a machine learning model, you want to explain it. We kind of touched on the What-If Tool before. So the What-If Tool helps you to compare different machine learning models. This is really important. Like I said, if we are deciding whether or not someone has heart disease, you might want to know why the machine learning model predicted that.

So if we go here, what is it: compatible models, supported data and task types. Binary classification, we've touched on that, multi-class classification, regression, tabular, image.

This is the explainability problem. You'll often hear that machine learning is a black box: you pass a whole bunch of things to an algorithm, the algorithm figures out the patterns, and then it just gives you an output, but you don't know the patterns. That's where the What-If Tool and SHAP values come in. SHAP is the other explainability tool, and it uses game theory to explain the outputs of your machine learning models. SHAP is actually pretty cool.

Interesting graphics like this come out of it.

And it's all to help explain why your machine learning model made the decision it did. So again, say you're predicting heart disease and the machine learning model goes, "you have heart disease", and the doctor is looking at it going, "well, I don't really have anything else to offer except that the model says you have heart disease."

You probably want some information to go, "oh well, the model noticed that your average heart rate is 145, and so your heart has to beat really hard to pump because your veins might be blocked to the brim with cholesterol."

Now again, I'm making that up, don't take that as a real explanation. The What-If Tool and SHAP values don't actually produce that, but they can produce different things, like how much a certain feature contributed to a certain prediction.
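To show what "how much a certain feature contributed to a certain prediction" can look like, here is a minimal sketch for a linear model, where each feature's contribution is its weight times how far the patient sits from the average. This is not the SHAP library or the What-If Tool, just the same additive idea done by hand; all weights, averages and patient values are made up for illustration.

```python
import numpy as np

feature_names = ["avg_heart_rate", "cholesterol", "age"]
weights = np.array([0.02, 0.01, 0.03])         # hypothetical learned weights
bias = -6.0                                    # hypothetical intercept
dataset_mean = np.array([75.0, 200.0, 50.0])   # hypothetical average patient

def score(x):
    """Raw model output (higher means more risk) for one feature vector."""
    return bias + weights @ x

patient = np.array([145.0, 260.0, 62.0])       # hypothetical patient

# Contribution of each feature relative to the average patient.
contributions = weights * (patient - dataset_mean)
baseline = score(dataset_mean)

for name, value in zip(feature_names, contributions):
    print(f"{name}: {value:+.2f}")
print(f"baseline {baseline:.2f} + contributions {contributions.sum():.2f} "
      f"= score {score(patient):.2f}")
```

The contributions add up exactly to the gap between this patient's score and the average patient's score, which is the property that makes additive explanations easy to read.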

All right, now the machine learning lifecycle. This is probably going to be a little bit further on in your journey. When you're first starting out, I think you're going to be mostly just in this stage.

This is how I have actually tried to stage a lot of this roadmap, or compass, whatever you want to call it: the branches here have cascading effects. You'll start with a library, then you'll track your experiments, then you'll look for pretrained models, or maybe you might do that before that.

That's probably better.

Data and model tracking. That kind of comes under experiment tracking. Then cloud compute services. But yes, once we get into the full-blown machine learning lifecycle, you want to look at things like MLflow, Kubeflow, Seldon and Streamlit. Let's just have a look at one of these, and also at that document that we had a look at before.

MLOps. We had a look at this one before, but this article only came out a couple of days ago, so it's really worth checking out.

There's a great resource right at the top here: a guide to production level deep learning.

Maybe I'll just put that in there.

Okay.

And I encourage you, actually: this roadmap isn't set in stone. If you think it can be improved, just make your own. This will actually help you.

What is it called? A guide to production level deep learning. It's a GitHub repo.

If the roadmap doesn't suit your needs, create your own. Honestly, even if you just copy out what's on here and make sure you actually read the different things, it will help with your understanding.

So MLflow, that's what we're going to look at: an open source platform for the machine learning lifecycle.

MLflow Tracking, Projects, Models, Registry.

Built-in integrations. These are all the other machine learning tools that you can use it with. There's Azure Machine Learning, that's the offering Microsoft has.

Kubeflow, we mentioned that. Google Cloud. Kubernetes, that's what Kubeflow is. I imagine Kubeflow is just a workflow that's built on top of Kubernetes. If you don't know, Kubernetes is a framework for running containerized applications, e.g. one container does one thing, such as preprocessing data, and another does another thing, such as modelling data.

There we go, Kubeflow: the machine learning toolkit for Kubernetes.

This is a lot to take in. Again, these sorts of tools won't come in until a later stage in your development, especially if you are just getting started with machine learning, but they're worth knowing about. At the end of the day, if you want to build things with machine learning, if you want to build things that get into the hands of people, this is where you're going to end up.

<unk>.

One of the most beginner-friendly points here, actually we'll put it up here, is Streamlit. Beginner-friendly tool, sorry.

I've used this to create proofs of concept.

So Streamlit: the fastest way to build data apps. Look at that.

So say, for example, you were building a model which detects cars and pedestrians and traffic lights in street photos. This is what Tesla could use to test out how their different models are going. Now of course, I don't know what Tesla actually uses, but this is what Streamlit is intended for: you've built a machine learning model, now let's build something that you can actually use to explore what that machine learning model has learned.

And look how easy it is to get started: pip install streamlit, then streamlit hello. That's two lines of code that you run in the terminal, by the way.

There we go, so I just ran pip install streamlit.

I'm pretty sure I've already got it, so I'm not going to do that, but that's a little challenge for you. Try that out. Try Streamlit, I highly recommend it.

Alright.

Oh, and then I'll just put this blog post here.

So, "What I learned from looking at 200 machine learning tools". Now we've covered maybe twenty here.

Now again, these are just the 20 that come to my mind, or the 20 that I have at least a little bit of hands-on experience using. So there are going to be many more, there are definitely many more that I've missed.

But the tools themselves don't necessarily matter as much as what you do with them. The typical problem for an engineer is to get obsessed with the tools rather than what they actually build with the tools.

So, right back at the start we said we were going to explore this roadmap.

Cooks. What does a cook or a chef use? A chef uses the controlled use of fire and a knife as their two major tools. So in fact, you can actually get the majority of what you need done with something like just TensorFlow: build a model with that, and then build a proof of concept with Streamlit. You'd be surprised how far you'll get with just those tools.

If you use TensorFlow you're probably also going to use TensorBoard. So that's probably three, and maybe Weights & Biases as a fourth. Don't overload yourself.

Now.

We've gone through tools, what are we up to next?

Come back to our Keynote. I'm going to have a little sip of water.

As you might have read, this is the point for machine learning mathematics, with this beautiful little abacus emoji. There wasn't one for a calculator. But that said, let's have a look. Now, machine learning mathematics: these are only some of the main ones, and as I said, I'm not going to be going into each of these in depth. I'm really just going to list the names, because one of the main questions I get from people asking about machine learning is "how do I learn the math?", and I think it's because math is actually taught pretty poorly in school.

When I started learning machine learning I started to like math way more. I mean, I've always been a numbers dude, I like numbers, but when I started to learn machine learning I realized, wow, it's actually just applied math, and math is really just the language of nature. So machine learning is one part linear algebra, and one part matrix multiplication and manipulation, especially if you're using neural networks. It's one part multivariate calculus and one part the chain rule, which is basically the entirety of backpropagation. Backpropagation is how many neural networks learn, or reduce their errors. Then there's probability and distributions.

And then optimization. So again, remember we said machine learning is finding patterns in numbers: how do you find the optimal pattern in numbers? If you imagine this little top of the hill, that's the maximum. Usually in machine learning you're trying to find the minimum. So you're trying to reduce the loss function, which is usually something that measures how wrong your machine learning algorithm's predictions are compared to what they should be.

So that's an optimization problem. You can literally go to the Wikipedia page for each of these and read just what they are. If you went through high school, you might have covered them. Do you need to know the ins and outs of all of them to get started with machine learning? No. My approach is to write machine learning code and then learn these parts when you have to.
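The chain rule and loss-minimizing ideas above can be sketched in a few lines of plain Python. This is a toy one-weight "network", not how any real framework implements backpropagation; the training example, starting weight and learning rate are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, x, target):
    """Squared error of a one-weight model: prediction = sigmoid(w * x)."""
    prediction = sigmoid(w * x)
    return (prediction - target) ** 2

def loss_gradient(w, x, target):
    """dL/dw via the chain rule: dL/dp * dp/dz * dz/dw."""
    z = w * x
    p = sigmoid(z)
    dL_dp = 2.0 * (p - target)   # how the loss changes with the prediction
    dp_dz = p * (1.0 - p)        # derivative of the sigmoid
    dz_dw = x                    # derivative of w * x with respect to w
    return dL_dp * dp_dz * dz_dw

# Hypothetical single training example and starting weight.
x, target = 2.0, 0.9
w = 0.0
learning_rate = 0.5

# Sanity check: the chain-rule gradient should match a finite-difference estimate.
eps = 1e-6
numeric = (loss(w + eps, x, target) - loss(w - eps, x, target)) / (2 * eps)
analytic = loss_gradient(w, x, target)

# Gradient descent: repeatedly step downhill on the loss.
initial_loss = loss(w, x, target)
for _ in range(200):
    w -= learning_rate * loss_gradient(w, x, target)
final_loss = loss(w, x, target)

print(f"gradient: analytic={analytic:.6f}, numeric={numeric:.6f}")
print(f"loss: {initial_loss:.4f} -> {final_loss:.6f}")
```

Libraries like TensorFlow and PyTorch compute these gradients for you, which is why you can write working machine learning code before you are fluent in the calculus.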

So let's come back.

Machine learning mathematics: what's running under the hood there again?

So, linear algebra: creating objects and a set of rules to manipulate these objects. E.g. x squared: x is the object and squaring is the rule manipulating that object.

Machine learning is about finding the right set of objects and the right set of rules to model a dataset well.

That's a pretty cool explanation, if I do say so myself.

Now again, you could read through this and you're going to get a much more complex explanation. I can only fit a couple of sentences here, and I'm sure, as you would imagine, someone will probably get angry at me for describing these in a couple of sentences.

So if you have a better explanation, please let me know. But then we've got linear algebra, and you want to read through that.

Look, if you haven't done math in a while you're probably going to be like, whoa, that's a lot of things I don't understand. But then what you're going to do is go to 3Blue1Brown's linear algebra series, and you're going to watch all of these.

And you're probably still not going to understand it all.

But then you're going to go to Rachel Thomas's computational linear algebra, and then you're going to learn linear algebra with code.

Looking at this bad boy.

When we got code in here I think this just might be in <unk>.

There we go, we've got some code.

So I'm going to, where is it, come back here.

I'm going to copy this.

I'm going to put this in the roadmap under linear algebra.

Boom.

Actually it should probably go in resources, but that is computational linear algebra. Well, that's how I do it. You can do whatever you want, I'm not your boss.

That's how I go about certain things. If I need to know a topic, I go to multiple different sources to figure things out, compress the knowledge in my own words into a couple of sentences, and then start to build upon that.

You might be thinking, "Daniel, when should I start learning all of these different math things?" Well, my biggest thing is: can you solve the problem you're currently working on?

There is no shortage of content here. This is a lot of math.

They do have overlap, yes, you're right. But if you just go out and try to learn these willy-nilly without having something to relate them to, at least in my case, I find it very hard. That's just pushing a boulder uphill. When I am working on a problem and I am stuck on something, then I find it really easy to learn something like matrix manipulation.

In machine learning, data of all kinds often gets turned into rows, columns and features, with features as the third dimension. It actually is the nth dimension, which can be many dimensions. I actually forgot what I wrote: these collections of numbers are often referred to as matrices or tensors.

So then I would go to something like that, go to the Wikipedia page, read through it, and go, okay, that's matrix multiplication. Then I try it out for myself.
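"Trying it out for myself" could look like the following sketch: multiply two small matrices with NumPy and check the result against the rows-times-columns rule written out by hand. The matrices are arbitrary example values.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4],
              [5, 6]])          # shape (3, 2)
B = np.array([[7, 8, 9],
              [10, 11, 12]])    # shape (2, 3)

# Inner dimensions must match: (3, 2) @ (2, 3) gives shape (3, 3).
C = A @ B

# The same thing by hand: each entry is a row of A dotted with a column of B.
manual = np.zeros((3, 3), dtype=int)
for i in range(3):
    for j in range(3):
        for k in range(2):
            manual[i, j] += A[i, k] * B[k, j]

print(C)
print("Matches the by-hand version:", np.array_equal(C, manual))
```

Once the by-hand loop and the `@` operator agree, the Wikipedia definition tends to stop looking abstract.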

So as I said, I'm not really going to go through these other than a high-level overview. You'll probably want to know some multivariate calculus, probability distributions, probability, optimization, and the chain rule. Again, how much? Can you solve the current problem you're working on? You'll probably find that 99% of the time you can with code. That's in the engineering sense. If you do want to be an artificial intelligence researcher, chances are you're going to have to know these inside out.

So get good at math and read this book end to end: the Mathematics for Machine Learning book. It's my favorite resource for learning mathematics for machine learning. Here you go, some of the things we've just talked about: mathematical foundations.

You've also got the Deep Learning book, fast.ai's Deep Learning from the Foundations, and various other resources in the resources branch.

That's math in a nutshell. The reason why I've gone pretty quickly over it is not that I'm taking away from mathematics. It's that, for some reason, one of the most common questions we get is "how much math do I need to know?" People are scared of learning math because some high school teacher said they were bad at math, or didn't teach it very well, and didn't tell them that math is actually the language of nature and that it's beautiful once you start to get into it. They look at the Greek symbols and go, "well, I can't do this." But you actually can. So my approach is to start learning code first and then learn math when it's required.

So if we come back to the keynote.

These are the topics that you're generally going to touch on. If you want an overview, again, just read the Wikipedia page for each of these. Even if you don't understand it, over time you'll start to get used to seeing these different terms, these different symbols or whatever.

And you'll go, "okay, I'm reading this machine learning document and suddenly the Greek symbols don't look too out of place, because I know that under the hood a lot of this machine learning code that I'm writing is actually just executing mathematics for me."

Wonderful.

So let's go into the final step, number five, everyone's favorite: machine learning resources.

Let's come back to the roadmap. We've gone through problems, we've gone through the process, we've gone through tools and we've gone through the mathematics, or at least covered each of these in a high-level overview. You might be thinking, "this is great Daniel, but how do I learn all of these things?" Well, we're going through that now.

So let's click here.

This is a pretty big one as well. Now again, we've got a note here: there's a lot here. My advice is that each of the resources is great, but you can't use them all. Choose a couple. Start with beginner if you're a beginner, explore, and figure out what suits you and what doesn't.

Maybe you like videos instead of books, but eventually you will have to learn to love reading, because put it this way: there's only so much videos can cover. With books, you can have a 700-page book like the one I'm holding in my hand right now, that you cannot see, that says Hands-On Machine Learning with Scikit-Learn, Keras and TensorFlow. To turn that into a video course would take hundreds of hours.

So learn to love reading.

And often the latest and greatest research is published in text form, not video form.

If you're just getting started, expect the materials here to be plenty enough to keep you going for two to three years, the equivalent of an undergraduate degree.

Now, I mean, seriously, you'll probably see a lot of things online like "learn machine learning in six weeks". Sure, you can learn the overall concepts. But if you want to get really serious, you'll probably realize once you get into it that machine learning is not just about learning models. The majority of machine learning problems are actually infrastructure problems. So software engineering meets machine learning, and we've covered a few things on software engineering meeting machine learning. Remember, that's called MLOps, so make sure to check out that blog post on MLOps. But without any further ado, let's get into it.

Machine learning resources. If we come back, I've actually made another little cool graphic.

Where to start learning.

Ooh.

That's pretty cool. So if you are an absolute beginner, expect this little flowchart here to take three to six months.

If you're an advanced learner and you've got some familiarity with all of these, go through something like this and expect it to take six to 12 months plus. Remember, there is no rush here. If you're learning something worthwhile, it's going to take a while.

Now, there are bonuses you can sprinkle in. The limit here on time is how much effort you put in. Me personally, I can't really study more than four hours a day. I've kind of found out that's just my limit; after that my brain starts to go to mashed potato. So yes, four hours per day is kind of what I've modeled this off. If you can do two hours, you're probably looking towards the upper end of this, and maybe a bit more. If you can do four hours, you're probably somewhere smack bang in the middle of both of these numbers.

Anyway, while you're doing that, it'd be a good idea to start sprinkling in some of these. So this is the Missing Semester, the missing part of your computer science degree. This little curriculum will teach you a lot of the little parts that machine learning courses tend to miss out on, and that is just some computer science things. You probably also want to choose one cloud provider that you get familiar with.

Yeah.

As you might have seen before in my Airbnb project, because I knew how to use Google Cloud I could get my code out into the world. That means someone could access what I've done in a web browser. Now if you want to learn web development, freeCodeCamp is one of the best resources online. Then there's Khan Academy, which is great for math.

If you want to figure out state-of-the-art research, you probably want to visit arXiv. That's all the technical papers for computer science, physics, mathematics and all that sort of stuff.

And you want to version control your code, which is where you save the code that you've been writing in multiple versions. So if you commit it on day one and it breaks on day two, you can revert back to day one.

Yes.

If you want to add a book to this, I would highly recommend part one of Hands-On Machine Learning with Scikit-Learn, Keras and TensorFlow for this little beginner section.

And part two for this advanced section. So let me walk you through this. You want to go: machine learning concepts. We've already covered a lot of the concepts in this video, and there are plenty of resources linked in the roadmap. So go through some of the concepts.

Then learn these tools: Python within Jupyter or Google Colab. So Python, then NumPy for numerical Python. Remember, machine learning is turning data into numbers and manipulating those numbers. A lot, and I mean a lot, of data processing and numerical processing is based on how NumPy processes data.

So that's why it's worth knowing. pandas is going to help you manipulate structured data, so like tables, Excel-spreadsheet type data. Scikit-learn has a whole bunch of machine learning functionality and machine learning algorithms built in that you can use. So this is at least six months' worth of work.
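Here is a minimal sketch of how those three tools hand off to one another: pandas holds the table, NumPy arrays carry the numbers, and scikit-learn fits a model on them. The tiny table and its column names are invented for illustration and are far too small for a real project.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical toy table standing in for a spreadsheet of structured data.
df = pd.DataFrame({
    "hours_studied":  [1, 2, 3, 4, 5, 6, 7, 8],
    "projects_built": [0, 0, 1, 0, 1, 2, 2, 3],
    "passed":         [0, 0, 0, 0, 1, 1, 1, 1],
})

# pandas to NumPy: features (X) and labels (y) as plain arrays of numbers.
X = df[["hours_studied", "projects_built"]].to_numpy()
y = df["passed"].to_numpy()

# scikit-learn: fit a built-in algorithm and make predictions.
model = LogisticRegression()
model.fit(X, y)
predictions = model.predict(X)

print("Training accuracy:", (predictions == y).mean())
```

In a real project you would also hold out a test set rather than scoring on the training data, but the load, convert, fit, predict shape stays the same.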

At the end of this you want to build a milestone project.

At least one, and that's with Streamlit. Of course, it doesn't have to be with Streamlit. I'm just recommending Streamlit, which we went through before, because it's actually a beautiful tool and it allows you to get some experience writing Python scripts.

So definitely check out Streamlit.

Remember, this is at like four hours a day. So if you can't do that, just extend this timeframe.

Now, advanced. I'd recommend something like this: once you've done all this, go through fast.ai part one and TensorFlow in Practice from deeplearning.ai. You can do those two in parallel.

After those, you want to check out Full Stack Deep Learning, which is about building a machine learning model and bringing it to production. There you go.

Yes.

Learn production-level deep learning from top practitioners.

That's what's up.

Then you've got milestone project two. So these are probably the two most important parts. Now, you don't necessarily have to wait to do these until you've learned some of this. You could do these as you're working through it. Your whole learning journey could be purely just building projects, and that's actually what I would recommend. At the same time, while you're going through this, read part two and then sprinkle in these. And the reason why I say sprinkle in is because it's hard to really know when you're going to need which parts of this. You're smart enough to use your own best judgment as to when you might need some more computer science knowledge or when you might need a cloud provider. And again, if I'm saying these things and it sounds like, "whoa Daniel, you're going way too fast", this will start to make more sense once you've gotten hands-on. There is no way that you're going to know what you'll need from here until you've gotten hands-on.

Then if you want to learn this, this is a plug for my machine learning course. This is what I teach at dbourke.link/mlcourse. Andrei teaches the Python part and I teach the machine learning part, so the machine learning concepts and the machine learning tools. Andrei is my business partner. If you want to learn as a beginner, check this out. We designed it specifically for someone who's brand new to machine learning and wants to get started with all of these tools.

These resources are actually also all here. So if you want concepts and processes, these are all free. Elements of AI is actually really cool.

All of Google's machine learning courses are open source and free.

This is what's up.

If you want to learn these things, the resources are there. So look at that: Machine Learning Crash Course with TensorFlow, from Google.

If you don't even want to think about which machine learning course to take, just take Google's course.

Google's AI education, Facebook's Field Guide to Machine Learning. Facebook also does massive amounts of machine learning.

Also Made With ML topics, you want to check this out as well. This is one of my favorite new websites in machine learning.

Made With ML.

Check this out. If you want to learn Python, it's got a whole bunch of topics where you can learn Python. If you want to learn TensorFlow, same again.

It's got a whole bunch of machine learning basics, tutorials, hands-on machine learning algorithms, linear algebra, linear regression, decision trees, convolutional neural networks, transformers, attention, classification. And these are all subsets of machine learning: computer vision, natural language processing. So definitely check this out.

There is no shortage of resources on all this stuff. Full stack, there we go.

Full stack: APIs, Docker, web scraping. If you wanted to collect your own data, you're probably looking at web scraping.

This is a phenomenal, phenomenal resource.

But as I said, it will require you to choose something. If you are in doubt about what to choose, choose anything and just go with the flow. Follow your own curiosity. That's what I always advise people: follow your own curiosity.

And then if you've got some skills, you can test them on workera.ai.

Well, it sounds pretty fun.

Standardized tests for AI skills. Take the test.

This is for after you've been learning machine learning for a while. You can sign in here, I think I have an account. Anyway, you take some tests and evaluate your skill levels in different things.

And if you want to, you can check out how to boost your skills: prepare for the machine learning test, prepare for the deep learning test, the data science test.

Figure out if you know machine learning algorithms, figure out if you know deep learning algorithms. Look at this, amazing. So learn some stuff from Made With ML, go through these topics, and then test your skills on Workera.

And then the real test of skills is actually not any type of exam, it's what you've built.

Hello.

I'm on fire here. If you're enjoying this, leave a comment below and let me know.

Because I'm having an absolute blast here. I hope you're finding some value in this, by the way.

I tried to put as much in here as I could, so it's really just a one-stop shop for getting started with machine learning. Beginner: if you're completely new, start here.

If you're completely new to machine learning, start by learning some Python code first. If you want to learn Python, you can learn Python in one video on YouTube by freeCodeCamp, or you can do the Zero to Mastery Python course. So this is full stack Python, taught by my business partner Andrei from the Zero to Mastery Academy. Look at that.

Zero to Mastery Academy. This is a big disclaimer: I'm part of this, right.

Andrei, there we go. He teaches the Python part of the machine learning course. But this is just to show you resources for where you can learn these things. You don't even have to use anything by us. There is so much out there. You've got Python Like You Mean It, which is more Python, but for numerical computing, so like STEM applications.

There we go: data analysis, machine learning, numerical work. If you want to learn Python specifically for these types of things, you probably want to go through that.

Tooling. You will need some bare-bones tooling to start writing Python code. So you might want to look into Anaconda or conda, pipenv, or virtualenv for managing all of your Python environments. Jupyter notebooks are where you write and explore machine learning code.

If you want to learn it all in one place, there's the Zero to Mastery Machine Learning course. Again, as a disclaimer, this is what I made.

We've got all of these topics in here. You'll find that on the Zero to Mastery Academy or on Udemy.

You can also go to the Kaggle learning center.

Faster data science education. I'm pretty sure this is all free.

DataCamp, Dataquest. Now, if you want example projects: once you've got three to six months plus of beginner work, the next step is to go to the advanced path.

So if we go here, you also want to have done a milestone project. I can't stress this enough. No matter what an instructor tells you about these things, including myself, including anyone from another course, including this video, it won't matter until you put it into practice.

I was going to say "into project", but that didn't really make sense. Remember when your parents told you that the stove was hot? I'm not sure how many times your mother and father, or whoever looked after you when you were growing up, said, "the stove is hot, the stove is hot, don't touch it." How did you figure out that the stove was hot?

You touched it, right? You touched it, you found out it was hot, and you didn't touch it again. So this is what a project is.

These concepts here are the equivalent of your mother and father telling you that the stove is hot.

The project is you touching the stove and going, "okay, you've told me all of these things, now it's my turn to touch the stove."

So once you've done some beginner stuff, come into the advanced path and have a look at some end-to-end projects and what they look like. This is probably before you go into the advanced stuff here.

This is probably what you would want to be working on: some projects like these. So here's an example binary classification project that I did, which is the heart disease one.
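To make "binary classification" concrete, here is a minimal sketch, not the heart disease project itself. It assumes a tiny made-up dataset with two features already scaled to the 0 to 1 range, and it swaps in a hand-written logistic regression trained with gradient descent so that nothing beyond the Python standard library is needed. Every name in it (`train_logistic_regression`, `predict`, `features`, `labels`) is hypothetical.

```python
import math


def sigmoid(z):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(features, labels, learning_rate=0.5, epochs=2000):
    """Fit weights and a bias with plain batch gradient descent."""
    n_samples = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            # Error is predicted probability minus the true label (0 or 1).
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j in range(n_features):
                grad_w[j] += error * x[j]
            grad_b += error
        # Step against the average gradient of the log loss.
        weights = [w - learning_rate * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


def predict(weights, bias, x):
    """Return 1 when the predicted probability is at least 0.5, else 0."""
    return 1 if sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) >= 0.5 else 0


# Made-up, already-scaled toy data: label 1 means "has the condition".
features = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.4],
            [0.8, 0.9], [0.9, 0.7], [0.7, 0.8], [0.6, 0.9]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

weights, bias = train_logistic_regression(features, labels)
predictions = [predict(weights, bias, x) for x in features]
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
print("training accuracy:", accuracy)
```

A real project would hold out a test set and would normally lean on a library rather than a hand-rolled model. The point here is only the shape of the problem: features in, a 0 or 1 out.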

If you can look through all of this after going through the beginner path, understand it and run it yourself, then you're ready to go on to the advanced path. Now, I'm scrubbing through this, and there's a lot here.

But as I said, remember, this is a three-to-six-month journey to get from zero to going through all of that.

Minimum.

But then you probably want to go through the deep learning <unk> curriculum and the <unk> AI curriculum. Remember, you can go through these simultaneously.

After that, you want to look at Full Stack Deep Learning <unk>: don't let your models die in Jupyter notebooks. Get them live, into people's hands, and see what you can do with machine learning. If I had my time again, I would go projects first. Don't get obsessed with the latest and greatest tool. Just build something that brings value to someone, to anyone.

And then, at the same time, I'd probably get proficient with at least one cloud service.

So again, pick one of these to get proficient with.

Finally, what's missing from most machine learning curriculums is general software engineering practices, like Docker and the Missing Semester of Your CS Education. If you wanted to go really, really deep, you could go to Teach Yourself Computer Science, teachyourselfcs.com.

Then we have the mathematics: linear algebra, calculus, matrix manipulation, neural networks, probability, statistics. If you want to learn something from all of these, you can go to the Essence of Linear Algebra page.

Rachel Thomas's Computational Linear Algebra: this is linear algebra with code. Let me put that in here.

Rachel Thomas, Computational Linear Algebra. Add link.

There we go. See, this is a living and growing document, so if you want anything added, let me know.

Python: if you want to read some books on it, try Automate the Boring Stuff with Python. Here are some of my favorite data science books. Machine learning practices and code: I cannot recommend this book enough, Hands-On Machine Learning with Scikit-Learn and TensorFlow.

I've literally got that sitting right next to me. See there again: "You last purchased this item on 11 April 2020."

Deep Learning for Coders by the fast.ai <unk>, Jeremy Howard and Sylvain Gugger: that's coming out in July 2020, so this is actually a preview of the book. Look at that, you come to this video and you get all these things that aren't even out yet. Come on, Daniel.

Building Machine Learning Pipelines: this is more of the full-stack machine learning stuff. That's actually coming out later this year too, so stay tuned for that one.

Interpretable Machine Learning: that's explaining why your machine learning model is making the decisions that it's making.

Mathematics: now, the number one book, and we've already covered this, is Mathematics for Machine Learning. Read that end to end if you really want to get an overall concept. If you just want the bare bones, you can also get The Matrix Calculus You Need for Deep Learning.

This one is by Jeremy Howard, who is the fast.ai instructor.

So if you come through here, when you first read this you're probably going to look at these Greek symbols going, whoa, that's a neural network there, going, whoa, what's happening here?

What is that? What is that?

But as you start to read through this, you'll start to see, there we go: power rule, sum rule, difference rule, product rule, chain rule. You'll start to say, okay, I can start to piece together these different bits, and I can start to read these heavy equations and go, yeah, okay, this starts to make sense now. Truth be told, if you asked me to go through this and recite it all, I probably can't. What it takes for me instead is that I have to revisit and relearn these things as I go as well.

So if you're watching this video and thinking that I'm some sort of expert on all of this stuff, don't get too far ahead of yourself. I'm still learning all of this as well.
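For reference, the differentiation rules named above can be written out for single-variable functions as follows. This is the standard textbook form, with f and g as arbitrary differentiable functions and n as a constant. It is not copied from the paper, whose notation may differ.

```latex
\begin{align*}
\text{Power rule:}      \quad & \frac{d}{dx}\, x^{n} = n\, x^{n-1} \\
\text{Sum rule:}        \quad & \frac{d}{dx}\bigl(f(x) + g(x)\bigr) = f'(x) + g'(x) \\
\text{Difference rule:} \quad & \frac{d}{dx}\bigl(f(x) - g(x)\bigr) = f'(x) - g'(x) \\
\text{Product rule:}    \quad & \frac{d}{dx}\bigl(f(x)\, g(x)\bigr) = f'(x)\, g(x) + f(x)\, g'(x) \\
\text{Chain rule:}      \quad & \frac{d}{dx}\, f\bigl(g(x)\bigr) = f'\bigl(g(x)\bigr)\, g'(x)
\end{align*}
```

As a quick worked example combining the chain rule with the power rule: the derivative of (3x + 1)^2 is 2(3x + 1) times 3, which is 6(3x + 1).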

<unk> cloud service.

A Cloud Guru is probably one of the best places to go. I've done a few courses on there, especially for getting certified with Google Cloud. Highly recommend that.

And then Google Cloud, if you want to learn that, AWS, Microsoft Azure. Remember, you only really have to choose one of these.

Some rules and tidbits. All of these are my favorite kind of articles, actually. Andrej Karpathy: he actually did that Tesla Autonomy Day talk that we referenced at the start of this video.

A Recipe for Training Neural Networks.

Now, every so often we get the privilege of finding one of these amazing blog posts by a practitioner.

Okay.

So this is where my advice comes from. Look at this. Number one: become one with the data. That's step one. And back over here in our machine learning process, in data preparation, is exploratory data analysis: learning about the data you're working with.

So, coming back to the blog post.
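As a rough illustration of what exploratory data analysis can look like at its very simplest, here is a small sketch using only the Python standard library. The rows, the column names (`age`, `cholesterol`, `target`) and the `summarize` helper are all made up for the example. In practice you would more likely load a real dataset into a notebook and use a data analysis library.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical toy rows; None marks a missing value.
rows = [
    {"age": 63, "cholesterol": 233, "target": 1},
    {"age": 37, "cholesterol": 250, "target": 1},
    {"age": 41, "cholesterol": None, "target": 0},
    {"age": 56, "cholesterol": 236, "target": 0},
    {"age": 57, "cholesterol": 354, "target": 1},
    {"age": None, "cholesterol": 192, "target": 0},
]


def summarize(rows, numeric_columns, label_column):
    """Collect row count, class balance and per-column basic statistics."""
    summary = {
        "n_rows": len(rows),
        "class_counts": Counter(row[label_column] for row in rows),
        "columns": {},
    }
    for column in numeric_columns:
        # Keep only the values that are actually present.
        values = [row[column] for row in rows if row[column] is not None]
        summary["columns"][column] = {
            "missing": len(rows) - len(values),
            "min": min(values),
            "max": max(values),
            "mean": mean(values),
            "median": median(values),
        }
    return summary


summary = summarize(rows, ["age", "cholesterol"], "target")
print("rows:", summary["n_rows"])
print("class balance:", dict(summary["class_counts"]))
for column, stats in summary["columns"].items():
    print(column, stats)
```

Even a summary this small answers the first questions worth asking of any dataset: how many examples there are, whether the classes are balanced, what is missing, and what range each feature covers.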

Every so often we'll get blog posts like Andrej Karpathy's A Recipe for Training Neural Networks, or Practical Advice for Building Deep Neural Networks. We'll get things like these.

Now, you might be thinking, well, these are just blog posts, they're not really hard rules. But what did I say all throughout this video? Machine learning is experimental. So it's only really after trying things a lot of times that you start to realize, okay, this works, this doesn't work, this works, this doesn't work. And so every so often we'll get blessed with a beautiful blog post like this. And my advice is always:

Create your own blog.

Yes, you should have one. Now, if we come up and have a look, you should read this article by Rachel Thomas on why you should have your own blog. One of the main reasons is so that you can craft these beautiful blog posts. Now, I'm not asking you to write a blog post like Andrej Karpathy's A Recipe for Training Neural Networks. It probably took 10 years' worth of experience to write something like this. But write something for yourself six months ago.

If you've just gone through this beginner pathway, write down the things that you've learned. Or what's wrong with this pathway? Did I do it wrong? Whatever you would like to have known when you started learning machine learning, write about it.

Because the truth is there are enough resources out there for learning the code and whatnot. What there aren't enough resources for is the process around learning these topics: the bits and pieces around how things fit together in general, like this machine learning mind map.

So write things like that. Help your previous self. And what you'll be surprised to find is that a lot of people probably have the same questions that you do. So if you want to create a blog, try fastpages or GitHub Pages or Medium. There's a whole bunch of different options that you can try.

But what writing does is it shows you how poor your thinking is. So when you think you understand something, try to write about it and teach someone else about it.

That's when you really start to understand it.

Some bookmarks: Arxiv Sanity. So if you want to look through the latest machine learning research, check out this little tool. It helps you

<unk> some of the best of, some of the most popular stuff.

Made With ML: this is a community-driven resource for your projects. We've actually just been through that, but if you do make a machine learning project, you should definitely post it there, whether it's something as simple as a blog post of 10 things.
