Users of Hadoop Clusters Get More Alternatives with Salesforce Wave Analytics

Users of the Salesforce Wave Analytics platform can now access their data stored in cloud warehouses, or even in Hadoop clusters kept in-house, far more easily thanks to partnerships with Cloudera, Google, Hortonworks, and New Relic, among others. Salesforce introduced Wave Analytics to give non-technical users, typically working in sales, marketing, or customer service, access to the data that drives customer satisfaction and loyalty.

Self-Service Hitches

Although the NoSQL-based platform was well received for its schema-less approach to data storage and its improved self-service, it was not perfect, as might be expected of a first-generation product, even one introduced only after years of research and development.

One of the things that frustrated users was the complex process of uploading data to the platform, particularly from Hadoop and Bigtable. Although Wave Analytics was touted as a self-service platform, importing data from Hadoop required difficult ETL processes well beyond the capabilities of non-technical users.
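To give a sense of the kind of ETL work that was previously left to the user, here is a minimal Python sketch. It converts tab-delimited Hadoop job output into a CSV file plus a JSON metadata descriptor, roughly the shape of input that Wave's External Data API ingests. The function names, the sample field names, and the all-`Text` field typing are illustrative assumptions, not Salesforce code.

```python
import csv
import io
import json

def hadoop_tsv_to_wave_csv(tsv_text, field_names):
    """Convert tab-delimited Hadoop job output into CSV text
    with a header row, the format Wave expects for upload."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(field_names)
    for line in tsv_text.splitlines():
        if line.strip():
            writer.writerow(line.split("\t"))
    return out.getvalue()

def wave_metadata(dataset_name, field_names):
    """Build a JSON metadata descriptor to accompany the CSV.
    Every field is typed as 'Text' here for simplicity."""
    return json.dumps({
        "fileFormat": {"charsetName": "UTF-8", "fieldsDelimitedBy": ","},
        "objects": [{
            "connector": "CSV",
            "fullyQualifiedName": dataset_name,
            "name": dataset_name,
            "fields": [{"fullyQualifiedName": f, "name": f, "type": "Text"}
                       for f in field_names],
        }],
    })

raw = "ACME\t2015-06-01\t1200\nGlobex\t2015-06-02\t900"
print(hadoop_tsv_to_wave_csv(raw, ["Account", "Date", "Amount"]))
```

Even this toy version shows why the process was out of reach for non-technical users: every dataset needs its own extraction step and a matching metadata file before anything can be uploaded.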

Plugging the Gaps

Growing discomfort among users over this missing functionality led to the recent launch of data connectors. Because the Wave platform can access big data in on-premises Hadoop clusters only after it has been uploaded to the cloud, the partnerships with various vendors have produced automated procedures for extracting data and uploading it to the Wave platform. Although some data will likely need to be transformed before it can be extracted and uploaded, the connectors have made life far easier for non-technical users, who until now had access only to change management tools like Flosum.com.

Bridging the Gap between BI and Hadoop Users

According to Salesforce, its technique for going native on Hadoop consists of a Java program installed on the Hadoop cluster that transports the data to the cloud. As the senior vice president and general manager of Salesforce Analytics Cloud puts it, it is essentially a 'last mile' connection. According to him, the real challenge lies in delivering the huge volumes of data residing in big data platforms to the marketing personnel or customer service representatives who can then use that information to interact with customers in a more meaningful, value-added way. This is exactly the functionality that the newly announced big data connectivity promises users.
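The article does not describe the connector's internals, but the general shape of such a 'last mile' upload is suggested by Salesforce's External Data API, which accepts a dataset header row followed by base64-encoded parts capped at 10 MB each. The Python sketch below (illustrative, not the actual Java connector) shows the chunking step; the API interaction is outlined only in comments.

```python
import base64

PART_LIMIT = 10 * 1024 * 1024  # the External Data API caps each upload part at 10 MB

def split_into_parts(csv_bytes, limit=PART_LIMIT):
    """Split a CSV payload into base64-encoded parts, numbered from 1,
    ready to be posted one row at a time to the upload API."""
    parts = []
    for i in range(0, len(csv_bytes), limit):
        chunk = csv_bytes[i:i + limit]
        parts.append({
            "PartNumber": len(parts) + 1,
            "DataFile": base64.b64encode(chunk).decode("ascii"),
        })
    return parts

# A connector running on the cluster would then, roughly:
#   1. create an upload header record (dataset name, format, operation)
#   2. post each part produced by split_into_parts()
#   3. flag the header record as ready, triggering ingestion in the cloud

parts = split_into_parts(b"Account,Amount\nACME,1200\n", limit=16)
print([p["PartNumber"] for p in parts])
```

Running the chunker on the cluster and shipping only encoded parts is what makes the connection a 'last mile' problem: the heavy lifting stays next to the data, and only the finished payload crosses into the cloud.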

The Java program that Salesforce and its partners jointly developed gives users access to the data in their Hadoop clusters from Hortonworks, Cloudera, and others. While a few customers are already working with this connector, Salesforce continues to work on integrating with Google more efficiently, even though another integration method is already available.

Partnerships for Data Transformation

The $5-billion CRM market leader is simultaneously collaborating with data transformation software developers such as Trifacta and Informatica, as well as New Relic, a developer of hosted analytics software, to make external data accessible to Wave users. Instead of undertaking manual data transformation and cleanup, which is both tedious and time-consuming, Wave customers can use the prebuilt transformations from Trifacta and Informatica to clean up the data residing on Hadoop before uploading it to the Wave platform in the cloud for analysis.
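To illustrate the kind of cleanup these transformation tools automate, here is a small Python sketch (the record shape and rules are invented for illustration; real Trifacta or Informatica transformations are far richer): it drops blank records, collapses duplicates that differ only in casing or whitespace, and normalizes name casing before upload.

```python
def clean_rows(rows):
    """Drop blank and duplicate records and normalize inconsistent
    casing -- the kind of cleanup a transformation tool automates."""
    seen = set()
    cleaned = []
    for row in rows:
        account = row.get("account", "").strip()
        if not account:
            continue  # skip records with no account name at all
        key = account.lower()
        if key in seen:
            continue  # skip duplicates differing only in case/whitespace
        seen.add(key)
        cleaned.append({"account": account.title(),
                        "amount": row.get("amount", 0)})
    return cleaned

raw = [
    {"account": "acme corp", "amount": 1200},
    {"account": "ACME CORP", "amount": 1200},  # duplicate of the first
    {"account": "  ", "amount": 0},            # blank record
    {"account": "globex", "amount": 900},
]
print(clean_rows(raw))
```

Done by hand across millions of Hadoop records, even rules this simple become the tedious, time-consuming work the article describes, which is exactly why prebuilt transformations matter to non-technical Wave users.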

Conclusion

It is clear that data integration facilities of this sort stem from the growing realization that big data is not the sole domain of IT specialists: it can boost company productivity and improve marketing and customer service when it is accessible to customer-facing personnel who are less technically skilled. As big data analytics becomes this universal, a new class of data-driven professionals is emerging.

David Wicks

Blog Author

David Wicks is a senior Salesforce developer working at the forefront of new technologies. An articulate speaker and writer on big data analytics, David has also written a series of insightful articles on Flosum.com capabilities and advantages.
 

