Users of Hadoop Clusters Get More Alternatives with Salesforce Wave Analytics

Users of the Salesforce Wave Analytics platform can now access their data stored in cloud warehouses, or even in Hadoop clusters kept in-house, far more easily thanks to partnerships with Cloudera, Google, Hortonworks, and New Relic, to mention a few. Salesforce introduced Wave Analytics to give non-technical users, typically working in sales, marketing, or customer service, access to the data that drives customer satisfaction and loyalty.

Self-Service Hitches

Even though the NoSQL-based platform had been very well received for its schema-less approach to data storage and its improved self-service, it was not perfect; that is to be expected of a first-generation product, even one introduced after years of research and development.

One of the things that bugged users was the complex process of uploading data to the platform, particularly from Hadoop and Bigtable. Even though Wave Analytics had been touted as a self-service platform, importing data from Hadoop required difficult ETL processes well beyond the capabilities of non-technical users.

Plugging the Gaps

The growing discomfort this missing functionality caused among users led to the recent launch of data connectors. The Wave platform can access big data in on-premises Hadoop clusters only after it has been uploaded to the cloud, but the partnerships with various vendors have produced automated procedures for extracting that data and uploading it to Wave. Some data will likely still need to be transformed before it can be extracted and uploaded, yet the connectors have made life far easier for non-technical users, who hitherto had access only to change management tools like Flosum.com.

Bridging the Gap between BI and Hadoop Users

According to Salesforce, its technique for going native on Hadoop is a Java program, installed on the Hadoop cluster, that transports the data to the cloud. As Salesforce's senior vice president and general manager for Analytics Cloud puts it, it is essentially a 'last mile' connection. The real challenge, he says, lies in delivering the huge amounts of data residing in big data platforms to the marketing personnel and customer service representatives who can then use that information to interact with customers in a more meaningful, value-added manner. This is exactly the functionality that the newly announced big data platform connectivity promises.

The Java program that Salesforce and its partners jointly developed gives users access to the data in their Hadoop clusters from Hortonworks, Cloudera, and others. A few customers are already working with the connector, and Salesforce continues to work on integrating with Google more efficiently, even though another integration method is already available.
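Salesforce has not published the connector's internals, but the mechanics it describes, a JVM program on the cluster that pushes extracts to the cloud, can be sketched against Wave's documented External Data API (an InsightsExternalData header plus base64-encoded data parts). The sketch below is illustrative only: the instance URL, HDFS path, dataset alias, API version, and SF_ACCESS_TOKEN environment variable are all assumptions, and real code would need proper OAuth handling, chunked parts, and error checking.

```scala
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}
import java.util.Base64

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Minimal "last mile" sketch: read a CSV extract from HDFS and push it to a
// Wave dataset. All names and paths below are hypothetical placeholders.
object WaveUploader {
  val instanceUrl = "https://yourInstance.salesforce.com" // hypothetical
  val accessToken = sys.env("SF_ACCESS_TOKEN")            // hypothetical
  private val http = HttpClient.newHttpClient()

  private def post(sobject: String, json: String): String = {
    val req = HttpRequest.newBuilder(
        URI.create(s"$instanceUrl/services/data/v39.0/sobjects/$sobject"))
      .header("Authorization", s"Bearer $accessToken")
      .header("Content-Type", "application/json")
      .POST(HttpRequest.BodyPublishers.ofString(json))
      .build()
    http.send(req, HttpResponse.BodyHandlers.ofString()).body()
  }

  def main(args: Array[String]): Unit = {
    // Read a CSV extract that an upstream job has already written to HDFS.
    val fs  = FileSystem.get(new Configuration())
    val in  = fs.open(new Path("/exports/accounts.csv"))
    val csv = try in.readAllBytes() finally in.close()

    // 1. Create the upload header; EdgemartAlias names the target dataset.
    val header = post("InsightsExternalData",
      """{"Format":"Csv","EdgemartAlias":"accounts","Operation":"Overwrite","Action":"None"}""")

    // 2. Attach the CSV as a base64-encoded part, referencing the header id.
    val id = """"id"\s*:\s*"([^"]+)"""".r
      .findFirstMatchIn(header).map(_.group(1))
      .getOrElse(sys.error(s"unexpected response: $header"))
    post("InsightsExternalDataPart",
      s"""{"InsightsExternalDataId":"$id","PartNumber":1,""" +
      s""""DataFile":"${Base64.getEncoder.encodeToString(csv)}"}""")

    // 3. A final PATCH setting Action to "Process" would trigger ingestion (omitted).
  }
}
```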

Partnerships for Data Transformation

The $5-billion CRM market leader is simultaneously collaborating with data transformation software developers such as Trifacta and Informatica, as well as with New Relic, a developer of hosted analytics software, to make external data accessible to Wave users. Instead of undertaking tedious and time-consuming manual data transformation and cleanup, Wave customers can use the prebuilt transformations from Trifacta and Informatica to clean up their data residing on Hadoop before uploading it to the Wave platform in the cloud for analysis.
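The Trifacta and Informatica transformations themselves are proprietary, but the kind of manual cleanup they save users from is easy to picture. As a rough, hypothetical illustration in Spark (column names and HDFS paths are invented), the work looks something like this:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// Hypothetical sketch of pre-upload cleanup on Hadoop-resident data.
object CleanupBeforeUpload {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("wave-prep").getOrCreate()

    // Raw extract sitting on the Hadoop cluster, assumed CSV with a header.
    val raw = spark.read.option("header", "true").csv("hdfs:///raw/contacts")

    val clean = raw
      .na.drop(Seq("email"))                                    // drop rows missing the key field
      .withColumn("email", lower(trim(col("email"))))           // normalize case and whitespace
      .withColumn("country", coalesce(col("country"), lit("Unknown"))) // fill blanks
      .dropDuplicates("email")                                  // de-duplicate on the key

    // Write a single tidy CSV for the uploader to push to Wave.
    clean.coalesce(1).write.option("header", "true").csv("hdfs:///clean/contacts")
    spark.stop()
  }
}
```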

Conclusion

This sort of data integration facility is clearly the result of a growing realization: big data is not the sole domain of IT specialists, and it can boost company productivity, marketing, and customer service when put in the hands of less technical customer-facing personnel. With big data analytics becoming ubiquitous, a new class of data-driven professionals is coming into existence.

David Wicks

Blog Author

David Wicks is a senior Salesforce developer working at the forefront of new technologies. An articulate speaker and writer on big data analytics, David has also written a series of insightful articles on Flosum.com capabilities and advantages.
 


Suggested Blogs

How Big Data Can Solve Enterprise Problems

Many professionals in the digital world have become familiar with the hype cycle: a new technology enters the tech world amid great expectations, dismay sets in as the retrenchment stage starts, and then practice and process catch up to the assumptions and the new value is unlocked. Currently there is apparently no topic more hyped than big data, and there is already no deficit of self-proclaimed pundits. Yet nearly 55% of big data projects fail, and there is a growing divide between enterprises that benefit from its use and those that do not. Qualified data scientists, good integration across departments, and the ability to manage expectations all play a part in making big data work for your organization.

It is often said that an organization's future depends on the decisions it takes. Most business decisions are backed by the data at hand, and the more accurate the information, the better those decisions are for the business. Gone are the days when data was merely an aid to better decision making; with big data, it has become a part of every business decision. For quite some time now, big data has been changing how business operations are managed and how organizations collect data and turn it into useful, accurate information in real time. Let's take a look at solving real-life enterprise problems with big data.

Predictive Analysis

Imagine having solid knowledge of the trends and technologies emerging in the market, or knowing when your infrastructure needs maintenance. With huge amounts of data, you can predict trends and your future business needs, and that sort of knowledge gives you an edge over your peers in a competitive world.

Enhancing Market Research

Regardless of the business vertical, market research is an essential part of business operations. With the ever-changing needs and aspirations of customers, businesses must find ways to get into the minds of customers with better and improved products and services. Having large volumes of data in hand lets you carry out detailed market research, thus enhancing your products and services.

Streamlining Business Processes

For any enterprise, streamlining the business process is crucial to keeping the business sustainable and lucrative. A few effective modifications here and there can benefit you in the long run by cutting operational costs. Big data can be used to overhaul your whole business process, from raw material procurement to maintaining the supply chain.

Data Access Centralization

Decentralized data has its advantages, but one of its main limitations is that it can create data silos, a challenge large enterprises with a global presence frequently encounter. Centralizing conventional data often posed a challenge and kept the whole enterprise from working as one team. Big data has largely solved this problem by offering visibility of data throughout the organization.

How are you navigating the implications of all that data within your enterprise? Have you deployed big data in your enterprise and solved real-life enterprise problems? Then we would love to hear about your experiences. Do let us know by commenting in the section below.

Analysis Of Big Data Using Spark And Scala

The use of Big Data over a network cluster has become a major application in multiple industries. The wide use of MapReduce and Hadoop technologies is proof of this evolving field, along with the recent rise of Apache Spark, a data processing engine written in the Scala programming language.

Introduction to Scala

Scala is a general-purpose, object-oriented programming language, similar to Java. Its name is short for "scalable language", meaning its capabilities can grow with your requirements; a growing number of technologies are also built on Scala. It can range from a simple scripting language to the preferred language for mission-critical applications. Scala has the following capabilities:

Support for functional programming, with features including currying, type inference, immutability, lazy evaluation, and pattern matching.

An advanced type system, including algebraic data types and anonymous types.

Features not available in Java, like operator overloading, named parameters, raw strings, and no checked exceptions.

Scala runs seamlessly on the Java Virtual Machine (JVM), and Scala and Java classes can be freely interchanged and can refer to each other. Scala also supports cluster computing; the most popular framework solution, Spark, was written in Scala.

Introduction to Apache Spark

Apache Spark is an open-source Big Data processing framework that provides an interface for programming data clusters with data parallelism and fault tolerance. It is widely used for fast processing of large datasets. Apache Spark is an open-source platform built by a wide set of software developers from over 200 companies; since 2009, more than 1000 developers have contributed to it. Spark provides better capabilities for Big Data applications than earlier Big Data technologies such as Hadoop MapReduce. Listed below are some features of Apache Spark:

1. Comprehensive framework: Spark provides a comprehensive, unified framework for Big Data processing and supports a diverse range of data sets, including text data, graphical data, batch data, and real-time streaming data.

2. Speed: Spark can run programs up to 100 times faster than Hadoop clusters when data fits in memory, and 10 times faster when running on disk. Spark has an advanced DAG (directed acyclic graph) execution engine that supports cyclic data flow and in-memory data sharing across DAGs, so different jobs can execute against the same data.

3. Easy to use: With a built-in set of over 80 high-level operators, Spark allows programmers to write Java, Scala, or Python applications quickly.

4. Enhanced support: In addition to Map and Reduce operations, Spark supports SQL queries, streaming data, machine learning, and graph processing.

5. Runs on many platforms: Apache Spark applications can run in standalone cluster mode or in the cloud. Spark provides access to diverse data sources including HDFS, Cassandra, HBase, Hive, and Tachyon, as well as any Hadoop data source, and it can be deployed on a standalone server or on a distributed framework such as Mesos or YARN.

6. Flexibility: In addition to Scala, programmers can use Java, Python, Clojure, and R to build applications on Spark.
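Before moving on to Spark's library ecosystem, here is what several of the Scala features described above look like in practice. This is a self-contained, illustrative snippet; all names and values are invented:

```scala
// Demonstrates type inference, immutability, currying, lazy evaluation,
// algebraic data types, and pattern matching.
object ScalaFeaturesDemo extends App {
  // Type inference and immutability: `val` is immutable, types are inferred.
  val words = List("spark", "scala", "hadoop")

  // Currying: a function with two parameter lists, partially applied below.
  def prefix(p: String)(s: String): String = p + s
  val withHash = prefix("#") _
  println(words.map(withHash)) // List(#spark, #scala, #hadoop)

  // Lazy evaluation: the right-hand side runs only on first access.
  lazy val expensive: Int = { println("computing..."); 42 }
  println(expensive + expensive) // "computing..." prints once; then 84

  // Algebraic data type (sealed trait + case classes) with pattern matching.
  sealed trait Shape
  case class Circle(radius: Double) extends Shape
  case class Rect(w: Double, h: Double) extends Shape

  def area(s: Shape): Double = s match {
    case Circle(r)  => math.Pi * r * r
    case Rect(w, h) => w * h
  }
  println(area(Rect(3, 4))) // 12.0
}
```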
Comprehensive library support

As a Spark programmer, you can combine additional libraries within the same application to add Big Data analytics and machine learning capabilities. The supported libraries include:

Spark Streaming, used for processing real-time streaming data.

Spark SQL, used for exposing Spark datasets over JDBC APIs and executing SQL-like queries on them.

Spark MLlib, the machine learning library, consisting of common algorithms and utilities.

Spark GraphX, the Spark API for graphs and graph computation.

BlinkDB, a query engine library for running interactive SQL queries on large data volumes.

Tachyon, a memory-centric distributed file system that enables file sharing across cluster frameworks.

Spark Cassandra Connector and SparkR, which are integration adapters; with the Cassandra Connector, Spark can access data in a Cassandra database and perform data analytics on it.

Compatibility with Hadoop and MapReduce

Apache Spark can be much faster than other Big Data technologies, and it can run on an existing Hadoop Distributed File System (HDFS) to provide compatibility along with enhanced functionality. It is easy to deploy Spark applications on existing Hadoop v1 and v2 clusters. Spark uses HDFS for data storage and can work with Hadoop-compatible data sources including HBase and Cassandra. Apache Spark is compatible with MapReduce and enhances its capabilities with features such as in-memory data storage and real-time processing.

Conclusion

The standard API set of the Apache Spark framework makes it the right choice for Big Data processing and data analytics; a minimal example follows below. For installations that already run a MapReduce implementation on Hadoop, Spark and MapReduce can be used together for better results, and Spark is the right alternative to MapReduce for installations involving large amounts of data that require low-latency processing.
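As a rough sketch of what such a Spark job looks like in Scala, the snippet below uses Spark SQL from the library list above. The dataset and names are invented for illustration:

```scala
import org.apache.spark.sql.SparkSession

// A minimal, self-contained Spark SQL job in Scala.
object SparkSqlDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("spark-sql-demo")
      .master("local[*]") // run locally; on a cluster this comes from spark-submit
      .getOrCreate()
    import spark.implicits._

    // A tiny in-memory dataset; in practice this would be read from HDFS,
    // Hive, Cassandra, or another supported source.
    val sales = Seq(("US", 120.0), ("US", 80.0), ("IN", 200.0))
      .toDF("country", "amount")
    sales.createOrReplaceTempView("sales")

    // Spark SQL: run a SQL-like query against the registered dataset.
    spark.sql("SELECT country, SUM(amount) AS total FROM sales GROUP BY country")
      .show()

    spark.stop()
  }
}
```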

Big Data Analytics: Challenges And Opportunities

Collecting data and deciphering critical information from it is a skill that has evolved with human civilization. From prehistoric data storage that used tally sticks to today's sophisticated technologies of Hadoop and MapReduce, we have come a long way in storing and analysing data. But how much more do we need to innovate and evolve to store this massive, exploding volume of data? And with so many business decisions riding on it, will we be able to overcome all the Big Data challenges and come out successful?

Today we use Ethernet hard drives and helium-filled disks for data storage, but the idea that information can be stored was formulated and put to use centuries before the first computer was built. Libraries were among the first mass storage areas and were built on massive scales to hold the ever-growing data. With time, more targeted devices were invented, such as punch cards, magnetic drum memory, and cassettes. The latter half of the 20th century saw huge milestones in the field of data storage: from the first hard disk drive, invented by IBM, to laser discs, floppy discs, and CD-ROMs, people realized that digital storage was more effective and reliable than paper storage.

During all this time, experts lamented that abundant amounts of information were simply being ignored when they could provide good commercial insights. But it was not until the invention of the internet and the rise of Google that this fact came to be truly appreciated. While data was always present, its velocity and diversity had changed, and it was imperative to make use of it. This abundant data now had a name, big data, and organizations were realizing the value of analysing it to derive deep business insights that could be used to take immediate decisions.

So what exactly is Big Data? The classic definition is large sets of data that keep increasing in size, complexity, and variability. Analysing enormous amounts of data can help make business decisions that lead to more efficient operations, higher productivity, improved services, and happier customers.

1. Lower costs: Across sectors such as healthcare, retail, production, and manufacturing, Big Data solutions are helping reduce costs. For example, a survey by McKinsey & Company found that the use of Big Data analytics in the healthcare industry could save up to $450 billion in America. Big data analytics can be used to identify and suggest treatments based on patient demographics, history, symptoms, and other lifestyle factors.

2. New innovations and business opportunities: Analytics gives a lot of insight into trends and customer preferences. Businesses can use these trends to offer new products and services and to explore revenue opportunities.

3. Business proliferation: Big Data is currently used by organizations for customer retention, product development, and improvement of sales, all of which lead to business proliferation and give organizations a competitive advantage. By analysing social media platforms, they can gauge customer response and roll out in-demand products.

But all said and done, how many organizations are actually able to implement Big Data analytics and profit from it? The challenge for organizations that have not yet implemented Big Data in their operations is how to start; for those that already have, it is how to go about it.
Analysts have to come up with the infrastructure, logistics, and architectural changes needed to fit Big Data in, and present results in such a way that stakeholders can make real-time business decisions.

4. Identifying the Big Data to use: Identifying which data to use is key to deciding whether your Big Data programme will succeed or fail. Data is exploding from all directions: internally from customer transactions, sales, supply chain, and performance data, and externally from competitive data, social media sites, and customer feedback. The trick is to identify which data to get, how to get it, and how to integrate it so that it makes sense and affects business outcomes.

5. Making Big Data analytics fast: Relevant data needs to be identified quickly to be of value. This requires high processing speeds, which can be achieved by installing hardware that can process large amounts of data extremely quickly.

6. Understanding Big Data: Your machines are superfast and you have all the required data, but does it make sense to you? Can your management take decisions based on that data? Understanding and interpreting the data is an important part of using Big Data, and it requires relevant expertise: skilled personnel who understand where the data comes from and how to interpret it.

To handle the constantly changing variables of Big Data, organizations need to invest in accurate data management techniques that allow them to choose and use only the information that will yield business benefits. This is where Big Data technologies come into the picture: advanced technologies such as Hadoop, Pig, Hive, and MapReduce help extract high-velocity, economically viable data that ultimately delivers value to the organization.
