About Wellcome Department Store Co., Ltd.
Founded in 1945, Wellcome is Hong Kong’s longest-established supermarket chain. Since 1964, the company has been wholly owned by Dairy Farm International Holdings following the merger with rival supermarket Dairy Lane. Wellcome employs 5,000 staff across more than 240 stores and serves more than 14 million customers every month.
Business Needs and Challenges
Previously, Wellcome ran its business intelligence (BI) process on-premises, and the BI team struggled with delayed reporting to top business management. The on-premises ETL process took more than 24 hours to complete, so top management could not get daily sales performance reports covering revenue and profit by individual store, product, and region.
The project aimed to deliver an end-to-end Extract, Load, and Transform (ELT) design and implementation for Wellcome's sales transaction data, generating transformed tables for the Tableau dashboards to use on a daily basis.
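As a rough illustration of the ELT pattern described above, the sketch below loads raw transactions into a staging table unchanged and then transforms them inside the database with SQL. It is only a sketch under stated assumptions: Python's built-in sqlite3 stands in for the data warehouse, and the table names, column names, and sample rows are hypothetical, not Wellcome's actual schema or data.

```python
import sqlite3

# Hypothetical raw sales transactions (not real Wellcome data):
# (sale_date, store_id, product_id, revenue, cost)
RAW_ROWS = [
    ("2019-01-01", "S001", "P10", 120.0, 80.0),
    ("2019-01-01", "S001", "P11", 60.0, 45.0),
    ("2019-01-01", "S002", "P10", 200.0, 150.0),
    ("2019-01-02", "S001", "P10", 90.0, 60.0),
]


def run_elt(conn, rows):
    """Load raw rows into a staging table, then transform them with SQL."""
    cur = conn.cursor()

    # Load: copy the raw rows into the database without reshaping them.
    cur.execute(
        "CREATE TABLE staging_sales ("
        "sale_date TEXT, store_id TEXT, product_id TEXT, "
        "revenue REAL, cost REAL)"
    )
    cur.executemany("INSERT INTO staging_sales VALUES (?, ?, ?, ?, ?)", rows)

    # Transform: aggregate inside the database into a reporting table
    # that a dashboard could read once per day.
    cur.execute(
        "CREATE TABLE daily_store_sales AS "
        "SELECT sale_date, store_id, "
        "SUM(revenue) AS revenue, SUM(revenue - cost) AS profit "
        "FROM staging_sales "
        "GROUP BY sale_date, store_id"
    )
    conn.commit()

    return cur.execute(
        "SELECT sale_date, store_id, revenue, profit "
        "FROM daily_store_sales ORDER BY sale_date, store_id"
    ).fetchall()


conn = sqlite3.connect(":memory:")  # in-memory stand-in for the warehouse
daily = run_elt(conn, RAW_ROWS)
for row in daily:
    print(row)
```

The transform step runs as SQL inside the database rather than in a separate processing tier, which is what distinguishes ELT from a traditional ETL flow.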
Solutions Provided by eCloudvalley
To resolve this challenge, eCloudvalley (ECV) proposed AWS as the new automated report generation platform. The platform adds retail analytics capability and shortens data processing time by leveraging Amazon Redshift’s massively parallel processing power. Combined with Tableau, it means senior management no longer needs to wait for .pdf files and can instead access sales and retail analytics dashboards at any time.
AWS provides the following capabilities:
Redshift is a cloud-based service that can be scaled up or down quickly to meet Wellcome’s demand, so Wellcome does not need to invest heavily in hardware or in staff with expert skills.
Perform the ETL process in near real time. AWS ETL services support multiple third-party services as data sources, automate the workflow, handle a wide variety of data sources, and build a data pipeline from cloud-based services.
Keep costs relatively low. As a cost-effective solution, Amazon Redshift can run workloads on anything from a minimal to a maximal number of nodes, depending on the customer’s requirements. Furthermore, the customer can choose the pricing model they prefer: on-demand or reserved instances.
Every business rule and business problem is reflected in the data, so we make an effort to discuss the data with the customer. In addition to becoming familiar with their data dictionary, we study their source data to check whether each data feature really makes sense for their business. Before we create any Redshift tables or Tableau dashboards, we have the whole picture of their business problem.
- Amazon Athena: In this ETL process for Wellcome, Amazon Athena plays an important role as an interactive query service that makes it easy to analyze data in Amazon S3 with standard SQL for ad hoc analysis.
- Amazon EMR: Amazon EMR is ideal for problems that require fast and efficient processing of large amounts of data. To leverage Amazon EMR, we aggregate all the data in Amazon S3 as a data lake, decouple compute from storage, ensure no data is lost in the event of a failure, and choose the right instance type to remain cost-effective.
- Amazon Elasticsearch Service: Amazon Elasticsearch Service automatically stores the original document and adds a searchable reference to the document in the cluster’s index. It also integrates Kibana, which we can use to analyze the data ingested into the Amazon Elasticsearch Service domain.
- Amazon Kinesis: Amazon Kinesis makes it easy to collect, process, and analyze real-time, streaming data so you can get timely insights and react quickly to new information. Amazon Kinesis offers key capabilities to cost-effectively process streaming data at any scale, along with the flexibility to choose the tools that best suit the requirements of your application. With Amazon Kinesis, you can ingest real-time data such as video, audio, application logs, website clickstreams, and IoT telemetry data for machine learning, analytics, and other applications. Amazon Kinesis enables you to process and analyze data as it arrives and respond instantly instead of having to wait until all your data is collected before the processing can begin.
- Amazon QuickSight: Amazon QuickSight is a fast, cloud-based BI service that makes it easier to build visualizations, perform ad hoc analysis, and quickly get business insights from your data. Amazon QuickSight’s serverless architecture lets you easily scale your insights with your growing user base, while ensuring you pay only for usage through its pay-per-session pricing model.
- Amazon Redshift: Amazon Redshift is a fast, scalable data warehouse that makes it simple and cost-effective to analyze all your data across your data warehouse and data lake. According to AWS, Amazon Redshift delivers ten times faster performance than other data warehouses by using machine learning, massively parallel query execution, and columnar storage on high-performance disk. We can load the structured data into Amazon Redshift and implement the business logic there.
- AWS Glue: AWS Glue is a fully managed extract, transform, and load (ETL) service that makes it easier for customers to prepare and load their data for analytics. You can create and run an ETL job with a few clicks in the AWS Management Console. You simply point AWS Glue to your data storage on AWS, and AWS Glue discovers your data and stores the associated metadata (e.g. table definition and schema) in the AWS Glue Data Catalog.
- Increase retail analytics capability (market basket analysis)
- Shorten data processing time by leveraging Redshift’s massively parallel processing (MPP) power
- Speed up the analytics process with Tableau to deliver business intelligence insights
- Reduce ETL time from more than 20 hours to 1.5 hours
- Access timely reports and dashboards at any time, anywhere, on any device
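To give a sense of what the market basket capability listed above involves, here is a minimal sketch that computes pair support and rule confidence from a handful of baskets. The baskets, item names, and function names are hypothetical and chosen only for illustration; the source does not describe the actual analysis run for Wellcome, which would operate on transaction tables in the data warehouse.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets: each set holds the items bought in one transaction.
BASKETS = [
    {"milk", "bread", "eggs"},
    {"milk", "bread"},
    {"bread", "eggs"},
    {"milk", "eggs"},
    {"milk", "bread", "butter"},
]


def pair_support(baskets):
    """Return {(item_a, item_b): support} for every pair bought together.

    Support is the share of all baskets that contain both items.
    Pairs are stored with their items in alphabetical order.
    """
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    total = len(baskets)
    return {pair: n / total for pair, n in counts.items()}


def confidence(baskets, antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    hits = sum(1 for b in with_antecedent if consequent in b)
    return hits / len(with_antecedent)


support = pair_support(BASKETS)
milk_to_bread = confidence(BASKETS, "milk", "bread")
print(support[("bread", "milk")], milk_to_bread)
```

In this sample, bread and milk appear together in three of the five baskets, and three of the four baskets containing milk also contain bread.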