Production Ready Task Force Work Page
This page is for communicating thoughts and ideas on works in progress to the workgroup.
Feedback from SNIA PRTF (2008/09/29)
I (Martine) had a good meeting with the SNIA PRTF today and we went over my previous presentation, looking to get specific prioritisation from SNIA on our items. Here's the list:
- SDC Slides Review -- I'll be sending out some slides later today that Paul will be presenting to the Storage Developers Conference regarding best practices; he'd really like our feedback on them.
- Spec Reviews -- Paul really likes the idea of us spending time going over specs. We tried to get some feedback with a recent management application TWG ballot, but no one in Aperi responded -- we should try to figure out why and if there's something we can do to make it easier to get feedback. For example, perhaps people would feel more comfortable talking about a certain profile or component rather than providing general purpose feedback as emails?
- Element manager behaviour -- This fits nicely with Aperi-Lite as a way to provide tooling/components that match SMI-S functions for element managers. As part of doing Aperi-Lite, we'll need to figure out such things as performance, coherency, caching, etc. All of this is useful for SNIA to know about.
- Events -- we can look at eventing (life cycle indications, alert indications) and how they match up with client needs to drive ideas for next generation eventing/indications.
- Scalability -- Paul is very interested in the relative improvements of changes like view classes vs non-view classes, etc., based on real-world client workloads.
- Stress Testing -- Not a high interest item in Aperi (due to the work involved in configuring and running the tests), but we can contribute some code that runs a real-world workload.
- Host Profiles -- this is tied to the Aperi-Lite as one of the components/features to implement. Paul says that anything that helps new profiles is a great thing...
Presentation for SNIA
Here is the presentation for SNIA: Aperi_PRTF.pdf
This material was presented to the SNIA PRTF on Monday June 9. Here are some notes I took from the meeting:
Tightening Performance Test
- They're looking to potentially allow some backward-incompatible changes after SMI-S 1.5 (call it 2.0).
- They're really looking for clients like Aperi to provide a perspective on things that are missing from the spec and need to be addressed (we'll get this again later).
Specification Churn
- Lots of discussion regarding what we're really talking about here. Things that are changing for no reason vs things that really need to be fixed. My thoughts are that we want to invest in making the appropriate changes to tighten current functionality over things that just expand capability.
- SNIA is looking for Aperi to weigh in on how the model should be extended such that it has minimal impact. For example, we can imagine a series of best practices or rules that enforce the right way to extend a model without breaking tools.
- Aperi should examine and report back to SNIA on items in the specs that have or will likely cause grief. From this we have concrete examples to reinforce the message.
Specification Lagging
- The suggestion here is to poll Aperi membership to determine what is missing from SMI-S that they'd like more work done on.
- Priority of lagging spec is MUCH lower than Spec Churn; there was some concern that these are counter to each other (rightly so), however I did mention that the priority tells the real story here.
Element Management
- There was good reaction to the idea of an Aperi-Lite for SMB that would do element manager functions for storage configuration management (e.g., list volumes, provision, zone, etc.).
- There was concern that a lot of a true element manager is outside the domain of SMI-S, so might have minimal value (e.g., reboot, load firmware, error logs, etc.).
- There was a suggestion that Aperi look at point-tools for things; the example that came out of the discussion was a clone utility that would allow you to query a storage box (e.g., array) and then recreate that configuration in another box (volumes, pools, hosts, zoning, etc.).
Embedded CIM Agents
- This one kind of surprised me a little bit. I have been thinking that embedding is the "natural" way to go for so long, I didn't expect to get any pushback. It turns out that many folks in SNIA do not believe that embedding is the best way forward.
- The reasons include:
- A bad implementation of a proxy will make a really bad embedded solution.
- We need to determine the base rationale for embedding and address those directly (e.g., ease of maintenance, availability, performance, etc.).
- Customers don't want to upgrade firmware; but they don't mind re-installing a new proxy.
- Aperi needs to articulate the specific benefits that embedding has and what problems it solves, and explain how we will mitigate some potential negatives.
Event Management
- Lots of discussion about the ills we've inherited with the CIM indication model.
- Primary fault seems to be directed at the way we define CIM indication filters; using CQL for filters is awkward at best.
- Suggestion is to throw it all away and re-do it for SMI-S 2.0.
- Where Aperi can help is to look at how Aperi would model events; what do we want to see and how would we like to see it (driven by use cases). Look at competitive event delivery schemes (SNMP came up) and see if they have values that should be brought into SMI-S.
- A tool that they'd like Aperi to consider is an analytic tool; something that listens to indications and provides reports. The idea is that they can review the reports to find out that providers are not sending the right indications, etc.
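A minimal sketch of what such an indication report could look like, assuming a listener has already captured each indication as a (provider, indication class) pair. The function name, the record shape, the provider names, and the per-provider "expected" lists are all hypothetical; only the CIM indication class names are standard.

```python
from collections import Counter


def summarise_indications(received, expected):
    """Count indications per provider and list expected classes never seen.

    received: list of (provider, indication_class) tuples (hypothetical shape).
    expected: dict mapping provider -> list of indication classes we expect.
    Providers that appear only in `received` are ignored in this sketch.
    """
    counts = Counter(received)
    seen = {}
    for provider, indication_class in received:
        seen.setdefault(provider, set()).add(indication_class)

    report = {}
    for provider, wanted in expected.items():
        got = seen.get(provider, set())
        report[provider] = {
            "counts": {cls: counts[(provider, cls)] for cls in sorted(got)},
            "missing": sorted(set(wanted) - got),
        }
    return report


# Made-up sample data for illustration only.
received = [
    ("array-a", "CIM_InstCreation"),
    ("array-a", "CIM_InstCreation"),
    ("array-a", "CIM_AlertIndication"),
    ("switch-b", "CIM_InstModification"),
]
expected = {
    "array-a": ["CIM_InstCreation", "CIM_InstDeletion", "CIM_AlertIndication"],
    "switch-b": ["CIM_InstModification"],
}

report = summarise_indications(received, expected)
for provider, entry in report.items():
    print(provider, entry["counts"], "missing:", entry["missing"])
```

The "missing" column is the part reviewers would read first: it points at providers that never sent an indication they were expected to send.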
Clients Coding to Vendor MOF
- This was somewhat new to the SNIA folks; they didn't realise the scale of the problem.
- Lots of suggestion of ways to tackle it -- public list of vendor extensions, tagging mechanisms in MOF for extensions, etc. -- but not that much that Aperi can help with.
- One area Aperi may be able to help is through tools to identify vendor extensions; but these are really developer tools, not client tools, so I'm not sure how that fits within our scope.
Scalability and Performance
- They'd like to see Aperi establish benchmark numbers for various vendors; e.g., "Vendor X1 probes in Y1 minutes, but Vendor X2 takes Y2". They are also looking for Aperi to provide performance goals such as "returning n instances of CIM_Volume in m seconds".
- They'd like to see Aperi provide general guidance for how operations/algorithms should be done for querying/fetching information.
- One idea that had a lot of traction in the meeting was for Aperi to host in its open source repository modules that would satisfy various SMI-S use cases (e.g., code for collecting Volumes and associating them to hosts; code for creating n new volumes from a storage pool).
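A performance goal of the form "n instances in m seconds" could be checked with something like the following sketch. It is not tied to any particular CIM client library: `fake_fetch` is a hypothetical stand-in for a real enumeration call, and the goal numbers are made up for illustration.

```python
import time


def measure_rate(fetch, clock=time.perf_counter):
    """Time one call to fetch(); return (count, seconds, instances_per_second)."""
    start = clock()
    instances = fetch()
    elapsed = clock() - start
    count = len(instances)
    rate = count / elapsed if elapsed > 0 else float("inf")
    return count, elapsed, rate


def meets_goal(count, elapsed, goal_instances, goal_seconds):
    """True when throughput matches or beats 'goal_instances in goal_seconds'."""
    # Compare the two rates by cross-multiplying, which avoids dividing
    # by an elapsed time of zero.
    return count * goal_seconds >= goal_instances * elapsed


def fake_fetch():
    # Hypothetical stand-in for enumerating volume instances from a CIM agent.
    return [f"volume-{i}" for i in range(1000)]


count, elapsed, rate = measure_rate(fake_fetch)
print(f"{count} instances in {elapsed:.4f}s ({rate:.0f}/s)")
print("goal met:", meets_goal(count, elapsed, goal_instances=1000, goal_seconds=10))
```

Running the same harness against several providers would give the per-vendor comparison numbers described above, provided the workload and device configuration are kept the same between runs.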
Stress Testing
- They agree that Olocity might be the best approach; this task was considered too complicated to take on in a part-time fashion (e.g., you need to worry about workload, device configuration, use cases, etc.).
Config and Setup
- Again, the clone-a-device tool came up as something Aperi can contribute, perhaps by merging the Aperi SAN Simulator technology with the SRM tool to "play back" the configuration.
- There is a discover fest coming up; they're trying to determine if they should kill SLP. Aperi may want to weigh in.
- Aperi can provide feedback regarding what they want to do with a CIM Agent. Use cases, capabilities/features, performance, etc.
Host Profiles Lagging Adoption
- There was definitely some interest in Aperi doing some HDR or other host support; however, it seemed that the main topic in the conversation had to do with SMI-S capabilities that are missing for managing internal storage.
Bet the Farm On SMI-S
- Need better marketing
- SNIA is very much in favour of the idea of componentising Aperi and creating a portfolio of tools such as Aperi-Lite, device clone, indication reporter, etc.
- The tools don't have to be big tools; they can be small but directed tools that solve real problems.
Known Issues
- Specification/SMI-S Level Issues
- Consistency -- Different implementations tend to do things slightly differently even if they don't have to. How can we tighten up the spec to define better the expected behaviour? High priority. SMI-S needs to get back-to-basics and tighten up the specification some more. Next steps would be to research what the variation between vendors/providers really is and use that to propose some real feedback.
- Backwards Compatibility (both within SNIA and for vendor extensions) -- How to ensure that products that use these standards don't end up painted into a corner where they can only work with one version at a time, or can't handle an upgrade of a component, etc. Medium-high priority. We need to come up with a way to register versions of vendor extensions and allow the client to determine up front if it's compatible with a given version of the provider.
- Lag between device function and SMI-S -- This is somewhat contradictory to earlier requirements/requests for avoiding spec churn and backward compatibility. How do we manage the conflict? Low priority (must not put backward compatibility at risk). Perhaps we can leverage a registered vendor extensions publication mechanism within SNIA. The idea is to publish model concepts that we're working on and use that material as foundation for new profiles and changes to profiles.
- Spec changes too quickly/much -- This makes it difficult for exploiters to code solutions as everything becomes a moving target. Medium priority. There needs to be a stable core of functions that get extended and sometimes that's difficult to agree on. Also, we need to stop declaring victory too soon.
- Technology
- Element Managers using SMI-S -- What kind of changes do we need to allow element managers to get the data they need quickly enough to handle a dynamic real-time customer environment? High priority (we're missing the boat here!). We need to look at mechanisms to provide real-time transfer of data (scatter/gather, push/pull, etc.).
- Embedding vs Proxy -- Embedding seems to make for better, more stable infrastructures, but isn't always possible? High priority. Customers don't want to have to set up servers and configure providers. Our solution should not make the problem space bigger. Provide consumer-reports-style ratings for usability/installability. We should participate in install-fest. Suggest a logo programme specifically identifying embedded (powered by embedded SMI-S?).
- Event Management -- Indications have some problems; should we fix them? Medium-high priority. We need to research what we really want to get out of indications, what they're used for, and determine when they should be used. For example, an indication telling clients about a minor change and an indication telling the client that the box is on fire are treated the same within the CIM protocol.
- Client software coded for SMI-S vs client software coded for specific device CIM model -- CTP needs to have a reference implementation of providers for client testing. This would ensure that the client software at least supports base SMI-S functions. More draconian measures would require a specific MOF to be used by both providers and clients -- e.g., no specialisation of classes; vendor extensions must be provided in a separate namespace, or perhaps just separate classes...
- Multi-tenant CIMOM -- Go embedded! :-)
- CIMOM Coexistence on Server -- Platform specific, not really an SMI-S solution -- relates to the distribution, platform, and implementation.
- Implementation/Adoption
- Scalability and Performance -- like the stability item, this becomes important when we try to get customers to really use this stuff. This is pretty important: recent data from ScaleFest indicate that the best providers delivered about 50 instances/sec, which is clearly not fast enough to support 30,000 to 60,000 volumes. We should create benchmarking clients to evaluate the performance and publish the results.
- Stress Testing -- how do we ensure that we've met stability? Recommend leveraging/requiring Olocity.com for independent third-party review -- kind of like UL approved.
- Configuration and Setup -- Setting up and configuring the CIM software. How do we make it easier for customers to deploy the software? Should be embedded with no special configuration required; but we will need to tackle it anyway because not everyone can get there right away. Things we can do: Marketing/messaging -- "Aperi works better with embedded providers", or list a set of tier-1 providers (embedded ones). There are some open-source storage arrays, and we can work with them to have embedded CIM Agents and reference them from Aperi. Create/update an Aperi best-practices page to include embedded providers/agents.
- Host profiles have been lagging adoption. Lots of OSs are shipping host profiles, but we're missing client exploitation of them. Tonnes of work Aperi can do to advance the host profile adoption. Maybe Aperi can drop host agent support and use SNIA host profiles instead. Core Profiles to implement: HDR and HBA (shipping on a bunch of OSs). AIX server running in tech centre today.
- Aperi has no support for SMI-S enabled host based discovered resources; but might be able to extend coverage through work items. Doing this might help gain momentum in space. -- merge with bullet above.
- SNIA starting up SMI-S developers group where engineers (open to all) can ask questions, and get FAQ. -- skip.
- Installation checklist -- similar to the Configuration and Setup, we need to define/express clearly the requirements to install the software. -- skip
- Bet the farm on SMI-S? When to use SMI-S and when not to; what needs to happen to make SMI-S the primary and obvious choice? Making Aperi successful will create competitive pressure to motivate vendors to do more in SMI-S.
- Promote Element Managers to use SMI-S. Aperi delegates to the element manager's specialised functionality using launch in context. Aperi has no stake in whether those element managers use SMI-S or not.
- Need more client software to support SMI-S -- Like the bet the farm idea, what do we do to increase adoption? New Aperi services/functions will be created as components that can be reused; however, existing components have not been componentised and would require work to break them apart. Perhaps we should create a workgroup within Aperi to determine which components should be pulled out and drive the execution. These components can be reconsumed by other projects.
- Reliability, Robustness and Stability -- overall problems that need to be addressed in order for SMI-S based management to supersede proprietary implementations. It is not acceptable to customers to have unstable management infrastructures -- skipped, covered in stress test, etc.
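The ScaleFest figure quoted under Scalability and Performance can be turned into a rough enumeration time. The sketch below assumes one instance per volume and a constant delivery rate, neither of which is stated in the notes; the function name is illustrative.

```python
def enumeration_seconds(volume_count, instances_per_second):
    """Seconds needed to deliver volume_count instances at a constant rate."""
    return volume_count / instances_per_second


RATE = 50  # instances/sec, the ScaleFest figure quoted above

for volumes in (30_000, 60_000):
    seconds = enumeration_seconds(volumes, RATE)
    print(f"{volumes} volumes at {RATE}/s: {seconds:.0f} s ({seconds / 60:.0f} min)")
```

Under those assumptions a single pass over the volumes alone takes 10 to 20 minutes, before any associated instances are fetched.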