With SAP HANA Smart Data Access (SDA), remote data can be accessed without first replicating it to the SAP HANA database. The following sources were supported as of 2013: Teradata database, SAP Sybase ASE, SAP Sybase IQ, Intel Distribution for Apache Hadoop, and SAP HANA.
SAP HANA treats the remote data like local tables in the database. Automatic data type conversion maps the data types of databases connected via SAP HANA Smart Data Access to SAP HANA data types.
This guide explains, step by step, how to use SAP HANA SDA with Hadoop data. It covers the following:
- Hadoop installation
- Loading data into the Hadoop system
- Working with unstructured data in the Hadoop system
- ODBC driver installation and configuration on the HANA server for access to Hadoop data
- Smart Data Access in SAP HANA (through SAP HANA Studio), using Hadoop as a remote data source
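To make the last step more concrete, here is a minimal sketch in Python that only builds the two SQL statements typically involved in SDA: one to register the remote source over an ODBC DSN, and one to expose a remote table as a virtual table. All names used here (`HADOOP_SRC`, `hive_dsn`, `SDA_DEMO`, `V_EMPLOYEE`, the `employee` table, the credentials) are hypothetical, and the `hiveodbc` adapter name and the statement shape reflect my understanding of the HANA SQL syntax, so please check them against the SQL reference for your HANA revision before running anything.

```python
def quote_ident(name: str) -> str:
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Quote an SQL string literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def create_remote_source_sql(source: str, dsn: str, user: str, password: str,
                             adapter: str = "hiveodbc") -> str:
    """Build a CREATE REMOTE SOURCE statement for an ODBC DSN.

    Assumption: the adapter name and clause order follow the SDA syntax
    as I understand it; verify against your HANA SQL reference.
    """
    return (
        f"CREATE REMOTE SOURCE {quote_ident(source)} "
        f"ADAPTER {quote_ident(adapter)} "
        f"CONFIGURATION {quote_literal('DSN=' + dsn)} "
        f"WITH CREDENTIAL TYPE 'PASSWORD' "
        f"USING {quote_literal(f'user={user};password={password}')}"
    )


def create_virtual_table_sql(schema: str, virtual_table: str, source: str,
                             remote_db: str, remote_schema: str,
                             remote_table: str) -> str:
    """Build a CREATE VIRTUAL TABLE statement pointing at a remote table."""
    target = ".".join(quote_ident(p) for p in (schema, virtual_table))
    remote = ".".join(
        quote_ident(p) for p in (source, remote_db, remote_schema, remote_table)
    )
    return f"CREATE VIRTUAL TABLE {target} AT {remote}"


# Hypothetical example values, for illustration only.
print(create_remote_source_sql("HADOOP_SRC", "hive_dsn", "hive", "secret"))
print(create_virtual_table_sql("SDA_DEMO", "V_EMPLOYEE",
                               "HADOOP_SRC", "HIVE", "default", "employee"))
```

Once a virtual table exists, it can be queried with ordinary SELECT statements like a local table, which is what makes the analysis in SAP HANA Studio possible without replicating the Hadoop data.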
Setup used for this guide:
1) Hadoop: HDP 1.3 for Windows (Hortonworks Data Platform), standalone, on a Dell laptop running Windows 7 64-bit with 8 GB RAM
2) SAP HANA server: standalone HANA 1.0 SPS 7 on SLES 11 SP1, running on a VM (24 GB)
Note: this guide was created in the January-February 2014 timeframe.
As another use case, I analyzed the employee designations used in various countries by a global company. The analysis result looks like the following in SAP HANA Studio (the data comes from Hadoop via SDA):
Hadoop Data Analysis using SAP HANA