Welcome to the third executive interview about the CPI Card Group case study for the Capstone course. This interview reflects the actual design requirements faced by CPI Card Group in 2015, providing context for the data integration assignment and documents in Module 3. The design documents simplify and modify the requirements faced by CPI Card Group. In previous executive interviews, I talked with Tyler Wilson, ERP Applications Manager at CPI Card Group in Littleton, Colorado, USA. Tyler is back for this interview to share his experiences with data integration practices at CPI Card Group. Tyler, tell the learners about the data integration challenges faced by CPI Card Group in 2015. How many data sources are transformed each night? What data volumes and processing times are involved? Can you describe a prominent data integration task that you have faced?

>> Yep, so currently we're looking at six different source systems that we bring data in from. These are all on-premise systems at the moment. How much data we pull, and when we pull it, really varies by system. Right now, three of the systems are pulled monthly; these feed a lot of the monthly consolidations that occur in our Hyperion financial systems. The other three are our transactional systems. Each of those systems is loaded every night into our data warehouse. And for one of those systems, part of the data is actually loaded every four hours. That's a lot of our production data, so we can get more of a real-time view on it. The load that occurs every four hours takes about 30 minutes each time we run it. The nightly loads each take about 30 to 45 minutes, depending on the system. The monthly loads are about an hour each. There's not too terribly much data coming in every day, I would say anywhere from one to three gigabytes a day, which over time does add up in volume.
We've done a lot of work to keep data we don't want to use out of the warehouse, to minimize that impact. One of the biggest transformation tasks we see is actually related to something simple, but I want to talk about it so nobody misses it in the future: it's related to time. You'll see that each database stores time a little bit differently. It may be time plus date, which is a different format than what the time dimension in our data warehouse stores. So it's important that whenever we import that data, we transform the time to match the time dimension, in order for it to be usable. And we have to do that on pretty much every single data source we have.

>> Well, thanks for sharing about the data integration challenges. Now I want to focus on the data integration tool used by CPI Card Group. What are the capabilities of this tool, and why did you select it? How much effort does it really require to use?

>> So, the tool we use is actually Oracle's Data Sync tool. It came with the package that we bought with our analytics tool; that's part of the reason why we went with it. Being in a traditional manufacturing environment, you don't really know how well technology adoption is going to occur, so we didn't want to make a large up-front investment in one of the larger ETL tools out in the marketplace. We adopted this tool, moved forward with it, and it has actually served us quite well for what we've done so far. It's SQL based, so everything is SQL driven. It's very lightweight: it sits on-premise and moves data for us from our on-premise applications into our cloud environment, which gives us exactly what we need at the moment. From a usability standpoint, as long as you know and understand SQL, you can use it fairly well.
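The date/time transformation Tyler describes can be sketched in a few lines. This is a minimal illustration, not CPI Card Group's actual code: it assumes a warehouse time dimension keyed by an integer `YYYYMMDD` date key plus a seconds-since-midnight time key, which are hypothetical names and formats chosen for the example.

```python
from datetime import datetime

def split_timestamp(ts: datetime):
    """Split a combined date+time value from a source system into the
    separate keys a warehouse time dimension might expect.
    (Key names and formats are illustrative assumptions.)"""
    date_key = int(ts.strftime("%Y%m%d"))              # e.g. 20150317
    time_of_day_key = ts.hour * 3600 + ts.minute * 60 + ts.second
    return date_key, time_of_day_key

# A source row stores date and time together; the warehouse stores them apart.
date_key, time_key = split_timestamp(datetime(2015, 3, 17, 14, 30, 15))
print(date_key, time_key)  # 20150317 52215
```

The same split would be expressed in SQL inside a SQL-driven tool like Data Sync, but the logic is identical: every source's timestamp format gets normalized to the dimension's keys before loading.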
We're able to establish different jobs or routines within that tool to serve different purposes based on our data sources, and we have the flexibility to run them at different times and do whatever we need with them. As we progress, though, I think we would like to get into tools that are a little more graphical and user friendly in that manner.

>> Thanks for providing background about the Data Sync tool used by CPI Card Group. Can you talk about your future needs for data integration, maybe in the next year or so?

>> Yeah, absolutely. Part of what we've seen as we progress and mature in our analytics capabilities is that everybody wants more real-time data. And I say more real time because it's hard to get truly real-time data, especially when you're looking at multiple source systems and then add transformations and everything else on top of that. Between the time a transaction is entered and the time you can transform it and get it to your analytics platform, there's a little bit of lag. So we're trying to move towards more real-time capabilities, and I think as we do that, we'll need some more advanced ETL tools. Part of why we stuck with what we did is that Oracle provides us a platform to grow. So we're really considering tools such as ODI in the next year, to give us an interface that is more graphical in nature, where we're not so dependent on people using SQL and PL/SQL to drive a lot of the changes within the tool.

>> Thanks, Tyler, for sharing the future plans and the data integration background. I think the learners will get some pretty good insights about data integration practices from this interview. Let me conclude this interview with some comments about the assignment in Module 3. The data integration assignment in Module 3 is hypothetical, not directly based on the requirements faced by CPI Card Group.
It was difficult to develop a closely related assignment because access to change data was not possible, and different tools are used. The assignment in Module 3 extends the assignment in Module 5 of course two. The Pentaho transformations that you will implement involve important features that were not covered in course two. You will be provided material on these features to guide your work on the assignment. Another important difference between the CPI Card Group experience and the case study is the population of the data warehouses. Since the data warehouse design in the case study differs in some important ways from CPI Card Group's design, using actual de-identified data was not an option. To populate the data warehouse in the case study, data generation software was developed. The data generation software provides flexibility for the time period, data warehouse size, relationships among tables (such as the ratio of leads to jobs), and inconsistency (such as between the job quantity and subjob quantities). The course website provides files to populate several data warehouses, for a more realistic experience for you.
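To make the data generation idea concrete, here is a small sketch of a generator with knobs like those described above: time period, size, lead-to-job ratio, and an injected rate of quantity inconsistency. All function names, parameters, and table shapes are hypothetical; this is not the course's actual generation software.

```python
import random
from datetime import date, timedelta

def generate_rows(start: date, days: int, leads_per_day: int,
                  leads_per_job: int, inconsistency_rate: float,
                  seed: int = 0):
    """Generate synthetic lead and job rows (illustrative sketch only).
    leads_per_job controls the ratio of leads to jobs; inconsistency_rate
    injects jobs whose quantity disagrees with their subjob quantities."""
    rng = random.Random(seed)
    leads, jobs = [], []
    lead_id = job_id = 0
    for d in range(days):
        day = start + timedelta(days=d)
        for _ in range(leads_per_day):
            lead_id += 1
            leads.append({"lead_id": lead_id, "lead_date": day})
            if lead_id % leads_per_job == 0:       # one job per N leads
                job_id += 1
                sub_qtys = [rng.randint(1, 5)
                            for _ in range(rng.randint(1, 3))]
                qty = sum(sub_qtys)
                if rng.random() < inconsistency_rate:
                    qty += 1                       # deliberate mismatch
                jobs.append({"job_id": job_id, "lead_id": lead_id,
                             "quantity": qty,
                             "subjob_quantities": sub_qtys})
    return leads, jobs

leads, jobs = generate_rows(date(2015, 1, 1), days=30, leads_per_day=10,
                            leads_per_job=4, inconsistency_rate=0.1)
print(len(leads), len(jobs))  # 300 75
```

Varying `days` and `leads_per_day` scales warehouse size, while `inconsistency_rate` lets an instructor plant realistic data quality problems for learners to detect during integration.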