Welcome to part 3 in this series demonstrating how to analyze Wikipedia page hit data using Hadoop and Hive with SAP HANA One. To see how we got here, check out the first blog post in the series, “Using SAP HANA to analyze Wikipedia data – Preparing the Data”. In part 2 of this series, “Reducing the amount of working data for SAP HANA to process using Amazon Elastic MapReduce and Hive”, I showed how you can use Hive to come up with a reasonable working set of data for SAP HANA One to process. The result was the creation of three files for each month, stored in the “s3://wikipedia-pagecounts-hive-results/” bucket in the directories named “year=2013/month=03”, “year=2013/month=04” and “year=2013/month=05”. In this blog post, I will show how to import the data generated by Hive into SAP HANA One using just the resources available to you on AWS and SAP HANA One, with the components highlighted in the accompanying diagram.

There are numerous ways to get data from AWS Elastic MapReduce (Hadoop) and Hive into SAP HANA One. SAP put together a video on using SAP BusinessObjects Data Services to import data from Hadoop into SAP HANA. The challenge with this approach is that Data Services is not easily available to AWS SAP HANA One users. Another way to import data from an active Hadoop cluster is through Apache Sqoop via a JDBC connection to your SAP HANA One database. With Sqoop, the operative word is “active”: you need to keep your AWS EMR cluster running in order to run Sqoop. If you keep an EMR cluster with nine m1.xlarge EC2 instances running at 48 cents per hour per instance, you are looking at approximately $100 per day to keep the cluster alive. The beauty of using AWS EMR with S3 storage is that you can still access the data in the S3 bucket after terminating the EMR cluster.

If you want to access Amazon S3 accounts as if they were local storage drives, TntDrive is an excellent option.
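As a quick check on the EMR cluster cost quoted earlier, the roughly $100 per day figure can be reproduced with a few lines of Python. This is only a sketch under the assumptions stated in the text (nine instances, 48 cents per instance per hour, running around the clock); the function and constant names are illustrative, and any additional AWS charges are not included.

```python
# Figures taken from the text: nine m1.xlarge instances at $0.48 per hour each.
INSTANCE_COUNT = 9
HOURLY_RATE_USD = 0.48
HOURS_PER_DAY = 24


def daily_cluster_cost(instances, hourly_rate, hours=HOURS_PER_DAY):
    """Return the cost in USD of keeping `instances` running for `hours` hours."""
    return instances * hourly_rate * hours


cost = daily_cluster_cost(INSTANCE_COUNT, HOURLY_RATE_USD)
print(f"${cost:.2f} per day")
```

Under these assumptions the result comes to a little over $100 per day, which is why terminating the cluster and reading the results directly from S3 is attractive.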
With TntDrive, you can map any number of drives, with no limit. Compatible storage includes standard Amazon S3, Amazon S3 in China, Amazon S3 GovCloud, and any other S3-compatible service. For each drive you can configure the access key, the secret access key, the preferred drive letter, and much more. TntDrive doesn't skimp on advanced features for those who have the knowledge or desire to use them. For example, drives can be mounted in either read-only mode or removable drive mode. You can also make file names case sensitive to allow a wider range of file names, and you can emulate Win32 file attributes. Enabling reduced redundancy storage can lower your storage costs. If you want to protect your data further, you can have the software apply server-side AES-256 encryption to the Amazon S3 bucket. You can also change how long cached data is kept.

The list of available functions in TntDrive is quite vast. You can view logs of all activity over a given period, completely reset the cache, run basic diagnostics if the system starts to act strangely, and much more. Thanks to one-click activation, the software can be stopped and restarted in an instant. It checks for updates automatically on a regular basis, but you can disable this option if you would prefer to keep your current version. You can even throttle bandwidth to keep data traffic sustainable. TntDrive is exceptionally stable, so it rarely crashes, hangs, or produces errors. Used properly, the software is fast and has little impact on the rest of the system's performance, because it uses very little memory and CPU.
Overall Opinion: When you first download and install TntDrive, the software goes through an automated setup process. This shouldn't be difficult at all, since the software essentially does all the work for you. After the setup is complete, you'll be shown the TntDrive user interface, which is nothing more than a simple window with a standard layout. From there, you can easily add mapped drives to your list of accessible drives.