[Info] FB alumni and current employees share
Hello everyone! I'd like to recommend a truly high-quality podcast: Iterative Venture.
This post shares the highlights of EP4:
Building Data-Driven Products with Data Science Founders Panel
It is well worth reading, full of learnings and candid reflections.
If you're interested after reading, please give it a listen, and support the founder's personal Facebook page:
https://www.facebook.com/richardcheniterativeventure/
The podcast updates monthly; subscribe, listen, and recommend it to your friends!
Most of the podcast's guests have Facebook work experience, and they share
top technical and startup topics, first-rate Silicon Valley insights, and hands-on work experience.
The podcast's founder previously worked at Facebook in Silicon Valley on
Data Science, Data Engineering, and Backend Software Engineering.
He was also honored to be interviewed on the podcast『矽谷為什麼』about
how to break into data science and become a data scientist,
and he shares deep dives into data topics through his fan page!
According to media reports, "data scientist" is the sexiest job of the 21st century.
What do data scientists actually do?
What skills do data scientists need, such as statistics and building predictive models?
Why do we need data? How can historical data drive more precise, effective decisions?
Over the next decade of data science, how do we make decisions and evaluate risk?
How do we build better data analysis and evaluation, whose design shapes a company's product analytics,
use data to help the business reach its goals and propose new recommendations,
and, across different business models, bring more insight to solving problems?
【Podcast Highlights】
The panelists shared their perspectives on the current landscape of the Data
Science startup ecosystem, the importance of a data-driven decision-making
culture in building products, and the tools the founders on the panel are
building, such as Statsig's product experimentation tool.
【Discussion】
【Iterative Venture Newsletter: Data Landscape in the 2020s】
(From our podcast EP4 episode discussing the latest data trends.)
"First reckon, then risk." - Von Moltke.
Why do we need data and a historical perspective
Oftentimes, we overhear conversations that companies get acquired for their “data”, their data platform, or that we have clean (or unclean) “data” etc.
What are they trying to say? Why would a company get acquired purely for their data? What is data, and why do we need it anyway?
The way we understand data is that data are sample points, much like the
experiences we have in life. The more we see the world, the more data points we have, and therefore the more apt we are to make decisions (hopefully good ones) based on the experiences we have had. This is no different for a company.
Thus, data is essential knowledge for the company, and our brains are just very complex and efficient data infrastructures enabling us to store data, make
decisions, explain such decisions, and iterate after incorporating new
information.
To make good decisions, therefore, we should have a lot of data points to learn from and be able to access them efficiently. This is what led to Facebook's creation of Hive, Presto, Scuba, and the like, each suited to a different problem: quick data accessibility for debugging purposes and simple insights (Scuba),
a cost-effective big-data warehouse (Hive) for data crunching, as well as
a mixture of the two (Presto), so as to enable more “citizen data scientists,”
as per Ashu.
The new decade
Humanizing AI
At the turn of the new decade, with rising optimism about what machine
learning models and AI can bring, so too comes the challenge of providing an explanation of how such models make decisions. This is where Krishna's
company, Fiddler.ai, comes in.
(Fiddler.ai: https://www.fiddler.ai/)
When it comes to complex machine learning models such as random forests,
XGBoost, and neural networks, the models can sometimes be so complex, with intertwined permutations, that explanation is simply impossible, as there is no single formula we can boil them down to.
For reference, below is an example of how a neural network's decision works at a
very high level, with each circle as a variable:
(Picture 2. Source: IBM)
Just how does it work? Many of us just take it as a black box.
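To see why the box is so hard to open, here is a minimal sketch of a tiny two-layer network; the layer sizes and weights are made up for illustration and are not from the episode. The point is that every input flows through every weight, so no single input maps to a single "reason" for the output.

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

# Hand-picked, hypothetical weights for a 2-input, 2-hidden-neuron network.
# Inputs: [income, debt], both normalized; output: a score in (0, 1).
W1 = [[0.8, -0.5], [0.3, -0.9]]   # hidden layer weights
b1 = [0.1, -0.2]                   # hidden layer biases
W2 = [1.2, -0.7]                   # output layer weights
b2 = 0.05

def forward(inputs):
    # Each hidden neuron mixes *all* inputs before the nonlinearity,
    # so any one input's contribution is entangled with the others --
    # this entanglement is the "black box" effect.
    hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
              for row, b in zip(W1, b1)]
    return sigmoid(sum(w * h for w, h in zip(W2, hidden)) + b2)

score = forward([0.9, 0.2])
print(round(score, 3))
```

Even in this toy case, reversing the output back to "which input mattered and why" requires extra tooling; with millions of weights, it requires a dedicated explainability layer.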
We spoke with Krishna (Podcast EP1) a while ago about this, and he said that the reason he is solving this problem is that there is a monumental shift in
traditional industries such as banking and insurance.
Namely, there is a paradigm shift in the way risks are assessed. In the
past, models were deterministic. That means we had a complex if-else switch
statement in place to evaluate whether someone is a risk and
therefore whether we should deny their credit card application. The good thing is that we know why we rejected someone. The downside is that our models are fixed and the importance of each criterion is pre-determined.
Machine learning models, on the other hand, can easily incorporate new data, and the models can easily be swapped with other models, thus offering a dynamic solution to the problem.
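As an illustration of the contrast, here is a sketch with made-up names and thresholds (not from the episode): the deterministic approach is a fixed rule set whose every rejection is traceable, while a learned model exposes the same interface but its weights would normally come from training and can change as new data arrives.

```python
import math

# Deterministic, rule-based risk check: transparent but rigid.
def rules_model(income, debt_ratio):
    if income < 30_000:
        return "deny"      # we can point at the exact rule that fired
    if debt_ratio > 0.4:
        return "deny"
    return "approve"

# A stand-in "learned" model: a logistic score with hypothetical weights.
# In practice the weights are fit to data and updated over time.
def learned_model(income, debt_ratio, weights=(0.00005, -6.0, -0.5)):
    w_income, w_debt, bias = weights
    z = w_income * income + w_debt * debt_ratio + bias
    score = 1 / (1 + math.exp(-z))
    return "approve" if score >= 0.5 else "deny"

# Both expose the same interface, so swapping one for the other
# (or for a retrained version) is a one-line change.
for model in (rules_model, learned_model):
    print(model.__name__, model(50_000, 0.2))
```

The price of that flexibility is exactly the explainability gap described above: the rule model's "why" is free, while the learned model's "why" needs tools like Fiddler's.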
Check out the podcast episode we had with Krishna to learn more about how
Fiddler is solving the problem.
Faster Iterative Loop
Another challenge that tech companies face in the new decade is the
ever-competitive landscape as there are more and more entrants.
For reference, the number of apps hitting the Apple App Store is growing exponentially. (Picture 3. Source: Statista; note the time scale changes in 2020.)
In this landscape, differentiation and the ability to learn and incorporate new changes become absolutely key. To this point, there is a saying in Silicon
Valley that if there are two startups in competition with each other, the
one that iterates faster will win.
This is where Vijaye and Statsig (short for statistical significance) come in. What Statsig hopes to achieve is to let any tech company easily set up a gatekeeper mechanism (the capability to show a feature that a company wants to roll out to certain users and not others, for comparison purposes) and
a dashboard that collects the relevant statistics for the control and test groups, in order to better understand whether a product launch achieved its intended purpose.
Furthermore, Statsig offers companies the capability to gradually roll out their features in releases, in a controlled fashion.
This not only lets teams rely on one platform for both releases and data collection, but also informs the relevant product teams of the effect of every release, inherently building in a data-driven culture, as every release can now be backed by statistics and data.
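The mechanics behind such a gatekeeper can be sketched in a few lines. This is illustrative only, not Statsig's actual API: users are hashed deterministically into control or test buckets, and the two groups' conversion rates are compared with a two-proportion z-test.

```python
import hashlib
from statistics import NormalDist

def in_test_group(user_id: str, rollout_pct: int = 50) -> bool:
    # Deterministic bucketing: the same user always lands in the
    # same group, so their experience is stable across sessions.
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return bucket < rollout_pct

def z_test(conv_a, n_a, conv_b, n_b):
    # Two-proportion z-test: is the test group's conversion rate
    # significantly different from the control group's?
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = (p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical launch data: control converted 100/1000, test 140/1000.
z, p = z_test(100, 1000, 140, 1000)
print(f"z={z:.2f}, p={p:.4f}, significant={p < 0.05}")
```

With these numbers the lift is statistically significant (p < 0.05), which is exactly the kind of evidence that lets a product team back a release with data rather than intuition.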
Democratizing Data Technology
From a broader overview, and from investors' perspectives, there is also a
general democratization of data technology, as well as vertical integration for efficiency, as per Ashu and Ravi.
As more and more folks from companies such as Google, Amazon, and
Facebook spread the data-driven culture, more and more people are
realizing the power of data and are seeking to democratize data usage, allowing for more “citizen data scientists”.
One angle is to reduce the technical bar needed to achieve the same result. One prominent example we came across in the past is Looker,
acquired by Google in 2019 for $2.6 billion. Looker treats data as objects, allowing users to create LookML models (data objects) that can be
joined with other datasets, enriching the data in a drag-and-drop fashion and reducing the need to write complex joins via SQL queries.
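For context, here is the kind of plain SQL join that such tools hide behind reusable data objects, shown with Python's built-in sqlite3 module; the tables and columns are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL);
    INSERT INTO users  VALUES (1, 'ada'), (2, 'lin');
    INSERT INTO orders VALUES (1, 1, 30.0), (2, 1, 20.0), (3, 2, 15.0);
""")

# The hand-written join and aggregation that a BI tool could instead
# generate from predefined data objects linking users to their orders.
rows = conn.execute("""
    SELECT u.name, SUM(o.amount) AS total
    FROM users u
    JOIN orders o ON o.user_id = u.id
    GROUP BY u.name
    ORDER BY total DESC
""").fetchall()
print(rows)  # [('ada', 50.0), ('lin', 15.0)]
```

Once a join like this is defined as a reusable object, a non-engineer can compose it with other datasets without touching SQL, which is the "citizen data scientist" angle described above.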
On the other hand, having vertically integrated platforms also means that we
can have fewer people working on the same system as the systems can be
centrally configured without companies having to write custom software.
One prominent example of this is the rise of cloud technology: in the past, a company may have had to set up its own data centers, but now everything can be configured elastically via AWS, GCP, or Microsoft Azure.
This trend is only accelerating as buzzwords such as DataOps and MLOps enter common use.
Just a matter of time
With various trends emerging, the challenge for many companies, especially
older ones, is that a lot of data still resides in on-premise data centers, as
per Ashu. Because of archaic technologies, such data centers do not lend
themselves to easy data transfer, so people have to physically remove the hard
drives in order to copy the data over. On the other hand, while such trends are
emerging in Silicon Valley and other tech hubs, many companies may not be able
to afford the same level of compensation for tech employees to justify said
employees moving to other companies (a good number of which are legacy ones)
and thus spreading the data-driven culture outside of Silicon Valley.
However, as with all problems, it is only a matter of time before such a
culture spreads beyond the Valley, and I have no doubt that talented folks will
come up with new solutions that are cheaper, more accessible, and easier to
use, so that companies of all kinds can adopt more data-driven solutions and,
ultimately, shape a more data-driven culture.
Podcast Link:
Spotify: https://lnkd.in/gYSm-Vvm
Apple: https://lnkd.in/g6kYiitq
Google: https://lnkd.in/gB_p7gMR
Any feedback would be appreciated
Thank you so much for reading this far!
If this post helped your learning,
please like and follow on Facebook:
Richard Chen - Iterative Venture.
If you have any questions, feel free to message the fan page, or leave a comment below :)
--