{"id":11567,"date":"2022-01-27T15:43:43","date_gmt":"2022-01-27T23:43:43","guid":{"rendered":"https:\/\/threecloud.wpengine.com\/?p=11567"},"modified":"2023-09-18T16:42:33","modified_gmt":"2023-09-18T23:42:33","slug":"databricks-for-dummies-what-is-scaling-out","status":"publish","type":"post","link":"https:\/\/3cloudsolutions.com\/resources\/databricks-for-dummies-what-is-scaling-out\/","title":{"rendered":"Databricks for Dummies: What is Scaling Out?"},"content":{"rendered":"<p><img decoding=\"async\" style=\"width: 709px;\" src=\"https:\/\/encrypted-tbn0.gstatic.com\/images?q=tbn%3AANd9GcT1jUvybT-juIhw__IXPEr7tD-QZzFgIYo5nA&amp;usqp=CAU\" alt=\"Machine learning and AI are changing how data science is leveraged\" width=\"709\" \/><\/p>\n<p>Keeping up with the latest tools and trends is a necessity to reap quality insights from your data, improve proactive decision-making, and ensure your team is operating as efficiently as possible. With innovative thinking setting the pace, change is the only constant in technology. Unfortunately, diving into new platforms and methodologies to stay ahead of the curve typically means navigating a sea of mystifying buzzwords and unfamiliar jargon.<\/p>\n<p>The latest addition to your radar of must-watch tools might be a platform called <a href=\"https:\/\/ccganalytics.com\/company-overview\/partners\" rel=\" noopener\">Databricks<\/a>, powered by Apache Spark and capable of abstracting complex cluster management to scale out your machine learning and data engineering workloads, with intelligent optimizations to dynamically reallocate workers given computational demands.<\/p>\n<p>Sounds cool, right? We think so too. But what does it all mean? 
What\u2019s behind all the buzz about Databricks and \u201cscaling out\u201d anyway?<\/p>\n<h3 style=\"font-weight: bold;\">Scaling Out Versus Scaling Up<\/h3>\n<p>Imagine this: you have a task to complete, and you\u2019d like to figure out what sort of workers can complete it for you as quickly as possible. Say you\u2019d like to scour a massive stack of takeout menus and compile all of your options for restaurants with tacos to go (a respectable endeavor). Currently, your team has only one member, who flips through these menus fairly slowly. To speed this up, one option is to hire an expert menu reader, very experienced and very keen on identifying tacos, who can work at a much quicker pace. They demand a high premium for their services and consistently work at the same pace at the same hourly rate.<\/p>\n<p>Another option is to onboard an entire team of average menu readers, who do not demand a high rate but get through the stack quickly through sheer numbers. Imagine also that these workers are flexible: each can be sent home during a lull, such as while you\u2019re collecting more takeout menus, unlike the expert, who is kept on retainer.<\/p>\n<p><img decoding=\"async\" style=\"width: 385px; display: block; margin: 0px auto;\" src=\"https:\/\/f.hubspotusercontent20.net\/hubfs\/5670923\/image-Jul-02-2020-04-42-04-60-PM.png\" width=\"385\" \/><\/p>\n<p>Moving from a slow worker to an expert is an example of scaling up: you handle large amounts of data and speed up workloads by beefing up your compute, for example by spinning up a larger virtual machine or adding a GPU. 
The flexible team of workers is akin to scaling out, in which additional, temporary compute resources (also called workers, in Spark lingo!) spin up and down as part of a \u201ccluster,\u201d a team of connected workers.<\/p>\n<p>The task is different, but the mechanics are the same &#8211; <strong><a href=\"https:\/\/databricks.com\/\" rel=\" noopener\">Databricks<\/a> can automatically allocate the compute resources necessary for your job, providing a more cost-effective, more flexible alternative to scaling up<\/strong>. Fortunately, the complexities that have historically made scaling out daunting are exactly what this platform handles well: Databricks abstracts away the complicated setup and overhead that usually precede working with clusters. It can not only scale dynamically while a task is running, but also schedule jobs and automatically start and stop the workers they need, for example to load your forecasting model and make predictions, or to perform transformations and aggregations on your newest available data.<\/p>\n<p>With the buzzwords demystified, this wealth of features makes a strong case for <a href=\"https:\/\/databricks.com\/\" rel=\" noopener\">Databricks<\/a> as an innovative tool, capable of improving performance and cutting cost at the same time.<\/p>\n<p>If you\u2019d like to start letting compute clusters take care of your heavy lifting, begin taking advantage of cutting-edge scaling and scheduling tools, or simply automate your solution to takeout taco detection, Databricks and 3Cloud are <a href=\"https:\/\/3cloudsolutions.com\/get-started\/\">here to help<\/a>.<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Keeping up with the latest tools and trends is a necessity to reap quality 
insights&mldr;<\/p>\n","protected":false},"author":72,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"content-type":"","footnotes":""},"categories":[260],"tags":[429,406],"class_list":["post-11567","post","type-post","status-publish","format-standard","hentry","category-data-ai","tag-data-and-ai","tag-databricks","topics-blog"],"acf":[],"_links":{"self":[{"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/posts\/11567","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/users\/72"}],"replies":[{"embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/comments?post=11567"}],"version-history":[{"count":0,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/posts\/11567\/revisions"}],"wp:attachment":[{"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/media?parent=11567"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/categories?post=11567"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/3cloudsolutions.com\/wp-json\/wp\/v2\/tags?post=11567"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}