By Bill Moran and Rich Ptak
Image courtesy of HPE, Inc.
HPE has introduced a new computing architecture it has titled “The Machine”.
Computer industry observers, as well as developers, have many questions about
this new architecture. This article describes some of The Machine’s key
features, considers what the new system might mean to the industry, and offers
some recommendations on how IT departments should react to it. We do not
attempt a deep dive into the technology. First, a little history.
Von Neumann Architecture
HPE has been working on its new architecture for several
years. Their goal has been to solve problems posed by the end of the von
Neumann computing era.
Von Neumann architectures were built around the concept of a central
processing unit, or CPU. All data reached the relatively fast CPU through much
slower memory. CPU speeds, along with the cost of memory itself, meant systems
used specialized I/O devices such as tapes, disks, flash memory, etc. to hold
the data to be processed. Until very recently, it was prohibitively expensive
to build memories large enough to hold all the data for processing.
In addition to being cheaper than memory, such I/O devices were non-volatile,
meaning they did not lose their data when power was turned off, as memory
would. Obviously, losing data along with power, whether intended (a shutdown)
or unintended (a power failure or device failure), was untenable for a working
datacenter. Another challenge was the constant movement of data: data often
comes from a variety of distributed I/O devices, moves to a central store,
then into computer memory for processing, and then moves back to storage or
another I/O device.
Moore's Law
Moore’s law explains why this architecture worked for the past half century.
Technically, the law (more precisely, an observation) relates to the number of
transistors on a chip: it states that this number doubles approximately every
two years (often quoted as 18 months). However, physical limitations mean the
law is breaking down; you can’t keep shrinking transistors indefinitely. With
the law’s end, we can no longer expect microprocessors to drive performance
improvement as they have in the past.
However, the requirement for increased processing power continues to grow,
driven in part by the sheer amount of data to be processed. Other factors
driving growth include an increasing number of devices, more complex data
types, the Internet of Things, and an expanding world population. Studies
report that the world’s data doubles every two years, with no sign of slowing
down, and the demand for the ability to process this giant mound of data
continues to increase.
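To put that doubling rate in perspective, here is a quick back-of-the-envelope
calculation of our own (not a figure from the studies above): doubling every
two years means growth by a factor of 2^(t/2) over t years, so a single decade
of doubling multiplies the data to be stored and processed by 2^5 = 32, and
two decades multiply it by roughly 1,000 (2^10 = 1,024).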
The Machine’s Memory-Driven Architecture
As noted, the center of the von Neumann architecture is the CPU; large data
sets must be split up to fit the relatively small amount of main memory tied
to each processor. The much faster, cheaper memory now on the horizon allows
for the emergence of a new architecture with memory at the center of the
system. This new non-volatile memory (NVM) preserves data even when power is
off, and all data is kept and accessible in greatly expanded memory.
HPE intends to use these emerging NVM technologies to form a new shared pool
of “fabric-attached” memory, meaning that any processor can access any byte of
data directly, without having to work through another processor. The fabric
also lets processors of many types (x86, ARM, GPU, etc.) communicate over the
same interconnect. This allows Memory-Driven Computing to match a workload
with the ideal processor architecture, so tasks can be completed in the
shortest possible time using the least amount of energy. For communication
over distances longer than about a foot, photonic/optical links allow physical
components spread over a wide area to perform as if they were all located in
the same rack. In addition to improving performance, this delivers
breakthrough energy efficiency and design freedom.
This structure has several additional immediate effects. First, it eliminates
the requirement for I/O devices in normal processing, a major cost saving.
Next, all data is accessible through normal CPU instructions, which simplifies
application programming because there is no need to manage and access data on
external devices. Processing speeds increase dramatically because all data is
immediately available for processing.
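To make that concrete, here is a minimal sketch of what “data accessed by
normal CPU instructions” can look like on a conventional Linux system today.
It is our illustration, not HPE code: we assume a system that exposes a region
of persistent memory as the device file /dev/pmem0, and the path, mapping size
and record layout are all hypothetical. Once the region is mapped, the program
reads and writes records with ordinary loads and stores; there are no
read()/write() calls and no external storage to manage.

```c
/*
 * Minimal sketch: treating byte-addressable non-volatile memory as
 * ordinary memory. Assumes a Linux system that exposes persistent
 * memory as a device file; the path and record layout are hypothetical.
 */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

struct record {          /* hypothetical application record */
    long   id;
    double value;
};

int main(void)
{
    const size_t region_size = 1UL << 30;   /* map 1 GiB of the region */

    int fd = open("/dev/pmem0", O_RDWR);    /* hypothetical pmem device */
    if (fd < 0) { perror("open"); return 1; }

    struct record *data = mmap(NULL, region_size,
                               PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (data == MAP_FAILED) { perror("mmap"); return 1; }

    /* No read()/write() calls and no buffering layer: the data is simply
     * addressed with ordinary CPU load and store instructions. */
    data[42].id    = 42;
    data[42].value = 3.14;
    printf("record 42: id=%ld value=%f\n", data[42].id, data[42].value);

    munmap(data, region_size);
    close(fd);
    return 0;
}
```

On real persistent-memory hardware, additional steps (such as flushing CPU
caches) are needed to guarantee durability; those details are omitted from
this sketch.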
Migration Considerations
A major concern when introducing a new architecture is the
difficulty of application migration to the new platform. Since most existing
programs were created in a world of limited memory, they will not automatically
take advantage of the new architecture. The algorithms that they depend on will
have to be rethought to use the larger memories now available. We have had a
taste of this type of effort resulting from the recent adoption of in-memory
databases.
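As a simplified illustration of what “rethinking an algorithm” can mean,
consider a program that used to answer each lookup by re-scanning a data file
on disk because the data could not all fit in memory. With abundant memory,
the natural redesign is to load the data set once, build an in-memory index,
and answer every lookup from memory. The sketch below is ours, not HPE’s; the
file name and record format are invented for the example.

```c
/*
 * Simplified illustration of rethinking an algorithm for abundant memory.
 * Old style: re-scan a file on disk for every lookup.
 * New style: load the whole data set once, then answer lookups from memory.
 * The file name and record format are made up for the example.
 */
#include <stdio.h>
#include <stdlib.h>

struct row { long key; double value; };

static int cmp_row(const void *a, const void *b)
{
    long ka = ((const struct row *)a)->key;
    long kb = ((const struct row *)b)->key;
    return (ka > kb) - (ka < kb);
}

int main(void)
{
    /* Load the entire data set into memory once. */
    FILE *f = fopen("dataset.txt", "r");      /* hypothetical input file */
    if (!f) { perror("fopen"); return 1; }

    size_t cap = 1024, n = 0;
    struct row *rows = malloc(cap * sizeof *rows);
    while (rows && fscanf(f, "%ld %lf", &rows[n].key, &rows[n].value) == 2) {
        if (++n == cap)
            rows = realloc(rows, (cap *= 2) * sizeof *rows);
    }
    fclose(f);
    if (!rows) { perror("allocation"); return 1; }

    /* Index it once; every subsequent lookup is a fast in-memory search
     * instead of another pass over external storage. */
    qsort(rows, n, sizeof *rows, cmp_row);

    struct row probe = { .key = 12345 };
    struct row *hit = bsearch(&probe, rows, n, sizeof *rows, cmp_row);
    if (hit)
        printf("key %ld -> %f\n", hit->key, hit->value);

    free(rows);
    return 0;
}
```

The same pattern, keeping the working set resident and indexed rather than
repeatedly staged through I/O, is the essence of the in-memory database shift
mentioned above.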
HPE is trying to make it as easy as possible to get started by using familiar
programming languages and constructs. It has invested considerable resources
in supporting the new architecture, even to the extent of creating an
optimized version of Linux running on a commercial System on a Chip (SoC)
currently under development by a partner. HPE also has an extensive program of
support and information, including specialized software tools and services, to
help developers and programmers understand the new environment.
Structure of the New Datacenter
When fully implemented, the new architecture will drive a revolution in
datacenter design. Full adoption will take many (10-20?) years. For an
extended period, applications designed for current technology will have to
coexist with applications designed for the new architecture. Ripping out all
the current I/O devices is simply not possible, for numerous reasons: expense,
risk, incompatibilities, poor or non-existent application documentation, etc.
Another factor is the time it takes to build confidence in a new architecture
as developers and other staff work to acquire expertise and learn its
idiosyncrasies.
Performance Considerations
As mentioned above, The Machine’s new architecture offers
the opportunity for many existing applications to be optimized to use what is
essentially unlimited memory. Note that the new architecture does not require
changes to programs, but changes are needed to achieve improved performance.
HPE modified some example code and reported between 10X and 100X faster
performance. These impressive numbers were achieved fairly easily by
optimizing existing algorithms for the larger memory environment. Further,
more radical changes to the algorithms can achieve far greater speedups: a
financial modelling example reported a speedup approaching 10,000X, but this
involved completely redesigning the application.
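One common pattern behind such gains is trading memory for time: when memory
is effectively unlimited, results that used to be recomputed on every use can
be precomputed once into a very large table and then simply looked up. The
sketch below is our own illustration of that pattern, not HPE’s example code;
the expensive_score function and the table size are invented.

```c
/*
 * Sketch of a memory-for-time trade that becomes attractive when memory
 * is effectively unlimited: precompute an expensive function into a large
 * lookup table and replace every recomputation with a memory read.
 * The expensive_score function and table size are invented for the example.
 */
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

#define TABLE_ENTRIES (1UL << 28)   /* ~2 GiB of doubles: feasible only with large memory */

static double expensive_score(unsigned long i)
{
    /* Stand-in for a costly computation repeated throughout a workload. */
    return sqrt((double)i) * log1p((double)i);
}

int main(void)
{
    double *table = malloc(TABLE_ENTRIES * sizeof *table);
    if (!table) { perror("malloc"); return 1; }

    /* One-time precomputation pass. */
    for (unsigned long i = 0; i < TABLE_ENTRIES; i++)
        table[i] = expensive_score(i);

    /* Hot path: what was a repeated computation is now a memory read. */
    unsigned long sample = 123456789UL % TABLE_ENTRIES;
    printf("score(%lu) = %f\n", sample, table[sample]);

    free(table);
    return 0;
}
```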
The performance story has two sides. On the one hand,
radical change can yield stunning results, as seen in the financial model
example. On the other hand, merely moving an existing unmodified application to
the new environment will not automatically deliver great or, indeed, any
performance improvement.
Economic Considerations
A key factor in determining the speed of new technology
adoption is cost. We lack sufficient data about HPE’s new technology to make
any cost estimates. Obviously, the cost of the new memory will be key. In the
best case, the new technology would be price competitive with, and offer a
clear performance advantage over, existing technology.
A Possible Action Plan
We think most large IT installations will benefit from devoting resources and
time to studying HPE’s new architecture. Here is why. An IT department wishing
to keep its most productive people must demonstrate that it is forward-looking.
No one wants to work in an environment where skills become obsolete because
the organization fails to track new technology. This alone justifies some
investment. A second, more compelling reason is the potential for immediate
payback. Many installations run applications developed when small memories
were common, where a lack of memory is a major cause of slow performance.
Evaluating applications to assess the potential benefit of access to large
memories can yield significant insight into the potential for investment
payback.
Of course, these efforts must be closely monitored and controlled. The adage
“if it ain’t broke, don’t fix it” still holds, and we are not recommending a
major “rip and replace” mindset. On the other hand, there can be no progress
without replacing old technology. HPE’s “The Machine” has the potential to
deliver major performance improvements in today’s applications. Determining
how much to invest requires a case-by-case evaluation; such effort can be
fully justified.
Conclusions
We applaud HPE’s investments in this architecture. Wisely, they are
cultivating a supportive ecosystem, with multiple efforts intended to
facilitate access and to educate and involve the greater IT development
community with The Machine. Our conclusions are as follows:
1. The Machine introduces Memory-Driven Computing, a radically new
architecture optimized for data-intensive processing.
2. It effectively addresses problems associated with the collapse of Moore’s
Law.
3. The architecture works, and indications are that it can provide impressive
(up to 10,000X) performance improvements for a certain spectrum of application
types when programmed appropriately. Modifying existing applications can yield
more modest, but still substantial, 10X to 100X improvements.
4. HPE has assembled an impressive array of tools, products and services to
encourage IT staff, especially developers, to become familiar with the
architecture by means of easy, inexpensive (sometimes free) access to the
technology.
5. We recommend looking at this technology; it could change your future!
To date, we have seen no other approach offering as comprehensive a solution
to the “end of Moore’s law” dilemma. It will be several years before a final
judgement on this new architecture is possible. Today, we are optimistic about
HPE’s prospects.