Interview with Dan Pupek, Chief Systems Architect at Advanced Systems Technology

Key Points:

  • We’re doing peer-to-peer interaction within a SCORM-based course
  • It was a mistake to combine several standards in SCORM, instead of keeping them separate
  • We shouldn’t try to solve everything at once
  • Manifest contains: meta-data, content structure, sequencing – too many different things
  • “A lot of the manifests and a lot of the XML formats that people are coming out with, it’s so heavily name-spaced, it becomes so arcane, that the format is no longer even readable.”
  • SOAP is too heavyweight
  • Should there be a way to communicate course structure via API? (not top of my list)
  • Should separate concept of course structure from that of a SCO – multiple lessons, but only one launch URL
  • Security of the API should be addressed, including the ability to do test scoring on the server
  • API could provide publisher-subscriber pattern for communication between learners in multi-player simulations
    • This is a lot more efficient than it used to be
    • Provide a shared hash-table that would not be persisted beyond the session
    • Team would get its own tracking record, as if it was a learner
  • As bandwidth grows, there will be more real-time video, instructor-led sessions, and more self-paced video also.

 

 

Can you describe your role within your organization and how you’ve worked with e-learning or SCORM?

I am the Chief Systems Architect for Advanced Systems Technology. We provide an LMS that we have developed over the last ten years for a number of clients: the Dept. of Homeland Security, Army Reserves, IFPO, and several others. There’s probably about 20 or 30 of them. We focus primarily on the public sector, and safety. We’ve done a lot of childcare training. One of our subsidiaries, Smart Horizon, is also the first fully accredited online high school district, accredited by AdvancED to issue diplomas, so right now we’re in the process of setting up six high schools, and we’ll be setting up probably a dozen or more by the end of the year. That’s sort of what I’ve done. I’ve used SCORM for maybe fifteen years now; I’ve worked with AICC and all the different iterations of SCORM, and we also developed a web-based content-authoring system that produces SCORM 1.2 and SCORM 2004 courseware. So that’s my experience, and I develop, primarily, web applications in C#.

What are you or your customers doing that you would say is new and innovative in your training and learning?

I don’t know how new and innovative it is, but they’re doing a lot of high-level interaction, and we’ve actually done a couple of pieces that had peer-to-peer or pseudo peer-to-peer interaction in a single SCORM-based course. For most of the high-level interaction, anything that is really heavy on the programming side, we’re using proprietary methods; we aren’t really seeing SCORM as an answer to that, or to anything cutting-edge that we do.

That’s good in a way; we’re trying to get a handle on things that SCORM isn’t designed to support.

To me, what’s really important is that a lot of the initiatives that ADL is working on, which have been clumped together in SCORM, need to be separated out. I think there are going to be a lot of demands for standards in different arenas, everything from sequencing to courseware, to LMS-course communications, to course packaging, all of these different standards. I think one of the big mistakes of the last few years was trying to combine all these standards instead of letting people pick and choose what they wanted. I really think ADL needs to define the problem domain that they’re going to solve, and not try to solve all the problems that people have.

When we’re looking at all the problems people have, it doesn’t necessarily mean we’ll try to solve all the problems people have. I am looking at what we should support explicitly, and what things we can stay out of the way of.

I’ll give you an example of what I’m saying: the format of the IMS manifest is much too heavily focused on being the meta-data, the content-sharing, the sequencing, everything in one area. I really don’t think that most people in the industry are as worried about content sharing, as in sharable content objects, after the course becomes a discrete package. In the development arena, yes; you want to be able to share a jpeg or some tags, etc. (by the way, those are really problems that maybe ADL doesn’t need to attack). But once it’s become a discrete package, I think the definition of the package should focus purely on what needs to be defined to make the course run. Not how you can break it up into sharable content objects, necessarily.

And by what needs to be defined to make the course run, you mean essentially, where’s the starting point you’re launching into it; anything else you mean by that?

I think manifests could read more like a scripting language than an XML file, or like an XML-style scripting language that simply defines how the course is expected to interact with the LMS. And some of the format of the manifest is a little too dependent on namespaces. I know this from experience: we can overburden ourselves with cool ideas, things that seem neat.

Nobody goes to write HTML and worries about what namespace the paragraph tag falls into. A lot of the manifests and a lot of the XML formats that people are coming out with, it’s so heavily name-spaced, it becomes so arcane, that the format is no longer even readable. It’s too complicated and too muddled with things that don’t do anything for the course. I mean, how much of what’s in the manifest does the LMS really need to know? And how much does the LMS need to know that’s not in the manifest? That’s really what people should be thinking about.

I think it could be separated out, and that way you could deliver something with or without the sequencing. I’m not sure that sequencing is really all that important to most people anyway. It was a way better idea as an academic experiment than in actual implementation. It’s one of those things that has just enough shortcomings that people fall back on proprietary sequencing, and single-SCO. That’s what people fall back on. That brings me to the API also: I think a lot of what goes on between the course and the LMS should be moved to the API and away from the manifest. One thing that scares me is looking at a SOAP-based web service. That’s way too heavy for a course. It should be something more like JSON over the wire. But one thing: if there is an ECMA back-end, like an API object, it should be asynchronous. There should not even be a non-asynchronous version of it. For every function or method that you call, there should be a call-back method that you have to wait on; that’s the nature of the web.
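
As an illustration of the call-back style described here, the following is a minimal sketch, not an actual SCORM or LMS interface. The class name `AsyncLmsApi`, the `call` method, the JSON field names, and the stand-in transport `fake_lms_transport` are all assumptions made up for the example; a real course would send the JSON over HTTP.

```python
import asyncio
import json


async def fake_lms_transport(payload: str) -> str:
    """Hypothetical stand-in for an HTTP round trip to the LMS."""
    request = json.loads(payload)
    await asyncio.sleep(0)  # yield control, as a network call would
    return json.dumps({"id": request["id"], "ok": True, "echo": request["params"]})


class AsyncLmsApi:
    """Every call returns immediately; the result arrives via a call-back."""

    def __init__(self, transport):
        self._transport = transport
        self._next_id = 0

    def call(self, method, params, callback):
        self._next_id += 1
        payload = json.dumps(
            {"id": self._next_id, "method": method, "params": params}
        )
        task = asyncio.get_running_loop().create_task(self._transport(payload))
        # The call-back receives the decoded JSON reply once it is available.
        task.add_done_callback(lambda done: callback(json.loads(done.result())))
        return task


async def demo():
    api = AsyncLmsApi(fake_lms_transport)
    replies = []
    task = api.call("set_value", {"key": "lesson2.score", "value": 100}, replies.append)
    await task              # the course is free to do other work while waiting
    await asyncio.sleep(0)  # give any pending call-backs a chance to run
    return replies


replies = asyncio.run(demo())
print(replies)
```

The point of the sketch is only that there is no blocking variant of `call`: the caller always supplies a call-back and gets the reply later.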

It might be reasonable to say there will be both a SOAP and a REST API, and with regard to ECMAScript and its asynchronous nature, courses might not be web-based … There might be a JavaScript option, but it can’t be the only option.

I think there should definitely be an over-the-wire API. Whether you go with SOAP or REST, you still will be doing an HTTP API.

Serializing and de-serializing JSON is a lot easier, but it doesn’t have the strict contract requirements, so your ability to evolve a service is less restricted.

The strict contract requirements are something I would think could cut both ways, because a lot of people have complained about interoperability. With strict contract requirements and a document being passed with a SOAP methodology, you can easily validate API calls against a schema. It seems like that would be an easy way to filter out or prove that at least certain things were being done correctly in the API.

You’re still going to fail at the other end, either way. If you send a JSON call, it’s not — you’re still not going to find the failure until you try to call the remote host.

You could have a test tool that could just generate calls and validate them yourself.

You can still validate the structure with a JSON call, it’s just not mandatory, that’s all. That’s the only difference. You don’t have to produce a WSDL and all these other things. I think for most of what we’re doing, it’s overkill.
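
To make the point about optional validation concrete, here is a minimal sketch of checking the structure of a JSON call by hand, with no WSDL or schema tooling involved. The field names in `REQUIRED_FIELDS` and the function `validate_call` are hypothetical and are not taken from any SCORM specification.

```python
import json

# Hypothetical shape of an API call: field name -> expected Python type.
REQUIRED_FIELDS = {"method": str, "learner_id": str, "params": dict}


def validate_call(raw):
    """Return a list of problems; an empty list means the call is well-formed."""
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        return ["not valid JSON: " + exc.msg]
    if not isinstance(call, dict):
        return ["top-level value must be an object"]
    problems = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in call:
            problems.append("missing field: " + name)
        elif not isinstance(call[name], expected_type):
            problems.append("wrong type for " + name)
    return problems


good = '{"method": "set_score", "learner_id": "a1", "params": {"score": 100}}'
bad = '{"method": 5}'
print(validate_call(good))
print(validate_call(bad))
```

A sender or a test tool can run this kind of check before making the call, but nothing forces it to, which is the difference from a SOAP contract that the answer above describes.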

I think different people will want different things. I would generally say that you want to support the technology that’s going to be quickest and easiest for people to implement. The biggest thing I would say is they need to have a spec not just for the API; they also need to define what the spec over the wire is. If it’s going to be SOAP or JSON or whatever, that spec needs to be available for people to use. But when you come to the API as well, one area that I think a lot of people haven’t thought about is that the way the course tells the LMS about its structure is through the manifest. So what if there were a way to communicate that through the API rather than the manifest? I’m not sure that that’s necessarily one of the most important things for me, though.

It might be very important in the context of the thought that packaging is separate from communication. If you go down that road, one of the things along it is the thought that you might get communication from content which has never been imported to the LMS. At which point, where’s the manifest? So that becomes important.

One of the problems — and this is another problem with the way the manifest is — when you get a single-SCO course, they put everything into a single SCO, they have 15 lessons in that SCO and how does that — I think the content developer would be more likely to give you a realistic picture of the course if you could separate out those descriptions. There should be some separate, not necessarily in the manifest, but maybe in the manifest, something that defines — Ok, there’s not 500 files in here, there’s one file and it does everything, but here’s the structure of the course. You’re not going to navigate through any of these, but at some point in time, I may give you a score for five lessons even though there is only one SCO.

I see exactly what you mean […], essentially you should be able to report on multiple SCOs even if they aren’t there.

Yeah, or don’t even think of them as SCOs. That’s another bit of terminology we need to get away from, because the shareable content object has not panned out. […] course, or you should be able to articulate the structure of the course, and then launch the course and it says, “He just scored 100 on lesson two,” then a couple of minutes later it tells the LMS, “He just scored 50 on lesson three.” But the course doesn’t care what SCO they happen to be on. That’s why everyone is falling back down to single SCO. They aren’t sure how each LMS is going to treat their structure. And frankly, the LMS developers aren’t sure how to treat SCO structures, the organization, half the time. I think you break it down into discrete lessons or gradable items somehow and then launch the course, and the API sits there and waits for the course to tell the LMS what the person scored on each of those things. If the course wants the LMS to do some kind of sequencing for it, that may be a separate sequencing file that would have some description of how. But if that file didn’t exist, then the LMS should just assume that it’s going to launch a single URL and let the course do everything. Because most courseware developers would just do that if the LMS allowed them to identify multiple lessons, but only have a single launch URL. You can give it a suspend data or hash-table or something where it could remember its location, and you could even –
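
The idea of declaring several gradable lessons behind a single launch URL can be sketched as follows. This is an illustration under assumptions, not a proposed format: the `CourseRecord` class, its fields, and the example URL are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CourseRecord:
    """One launch URL, several gradable lessons declared up front."""

    launch_url: str
    lessons: list                      # lesson identifiers the course declares
    scores: dict = field(default_factory=dict)

    def report_score(self, lesson_id, score):
        """Called whenever the running course reports a lesson score."""
        if lesson_id not in self.lessons:
            raise KeyError("unknown lesson: " + lesson_id)
        self.scores[lesson_id] = score

    def outstanding(self):
        """Lessons the course has not reported on yet."""
        return [lesson for lesson in self.lessons if lesson not in self.scores]


record = CourseRecord(
    launch_url="https://example.invalid/course/index.html",
    lessons=["lesson1", "lesson2", "lesson3"],
)
record.report_score("lesson2", 100)
record.report_score("lesson3", 50)
print(record.scores, record.outstanding())
```

The LMS never navigates between the lessons here; it launches one URL and waits for score reports, which is the single-launch behaviour described above.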

It already has the location field. I mean, suspend data should be bigger, considering that you’re reporting across multiple, virtual SCOs, which aren’t SCOs. —

I would say, make a hash-table available, like a dictionary, instead of just… and maybe limit the size of each item in the dictionary, but have it so the course could store “n” number of things in that dictionary. That’s not an unrealistic request these days. […] it in suspend data and just make a hash-table available so you could store dictionary items.
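
A dictionary-style replacement for suspend data, with a cap on the size of each item, might look like the sketch below. The class name `SuspendDictionary`, its methods, and the 4096-byte default are assumptions chosen for illustration only.

```python
class SuspendDictionary:
    """Key/value store for course state, with a per-item size limit."""

    def __init__(self, max_item_bytes=4096):
        self._items = {}
        self._max_item_bytes = max_item_bytes

    def set(self, key, value):
        """Store one string item, rejecting anything over the size limit."""
        size = len(value.encode("utf-8"))
        if size > self._max_item_bytes:
            raise ValueError(
                "item %r is %d bytes; limit is %d" % (key, size, self._max_item_bytes)
            )
        self._items[key] = value

    def get(self, key, default=None):
        """Fetch only the requested item, not the whole store."""
        return self._items.get(key, default)

    def keys(self):
        return sorted(self._items)


state = SuspendDictionary(max_item_bytes=16)
state.set("lesson2.page", "12")
state.set("lesson3.bookmark", "intro")
print(state.get("lesson2.page"), state.keys())
```

The number of keys is unbounded in this sketch and only each value is limited, which matches the request for "n" items with a size limit per item.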

….

What else are you doing that’s innovative, or should be addressed?

One thing that I think SCORM could address in some manner: we do a lot of test-item analysis. And we also require that the student or learner not be able to somehow bypass the system and pass the test. So we don’t deliver any tests in any of our SCORM courseware. They’re all delivered through a test facility we have in our LMS. It would be nice if the API had a way to do test-scoring from the server. So inside the course there could be a test bank or question and answer bank that the LMS would have available, and when the course gave a test, it would send the answer back to the LMS and the LMS would tell the course whether it was right or wrong.

Right now (and I posted an exploit years ago in a PDF file and people freaked out) there’s a very easy way to pass any test in a SCORM course, just by sending a few API commands through the URL address bar. So if the score were based purely on passing and failing the test, and if only the LMS server knew the answers, and the answer had to be sent back to the server and the server simply sent back true or false, right or wrong, the only way a student could cheat is if they were sending back the correct answers, which they would have to do anyway. Right now, they can just set the score to 100, lesson status to pass, commit and finish, and you’ve just passed.
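
The server-side scoring idea can be sketched as below. It is a hypothetical illustration: `ServerSideTest`, `submit_answer`, and `score` are invented names, and the answer key would live only on the LMS server.

```python
class ServerSideTest:
    """Holds the answer key on the server; the client can only submit answers."""

    def __init__(self, answer_key):
        self._answer_key = dict(answer_key)  # never sent to the course
        self._results = {}

    def submit_answer(self, question_id, answer):
        """Record the answer and tell the course only right or wrong."""
        correct = self._answer_key[question_id] == answer
        self._results[question_id] = correct
        return correct

    def score(self):
        """Score is derived from recorded answers; there is no way to set it."""
        correct_count = sum(1 for ok in self._results.values() if ok)
        return round(100 * correct_count / len(self._answer_key))


test = ServerSideTest({"q1": "b", "q2": "d"})
first = test.submit_answer("q1", "b")
second = test.submit_answer("q2", "a")
print(first, second, test.score())
```

Because the interface has no "set score" or "set status" operation, the address-bar trick described above has nothing to call; the only way to raise the score is to submit correct answers.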

What other things were you doing that were new and innovative in terms of training?

We had run a pilot for the Army, a SCORM-based course that was a simulation, but it was multi-player: when you launched the course, you were interacting with others who were actually active in the course at the same time, solving problems together. I’m not sure that’s a problem SCORM needs to tackle, though.

This is something for which we actually developed a separate hosting application; the cross-domain security problems imposed by ECMA JavaScript did not help. So definitely having a non-ECMA API, like an over-the-wire HTTP API or something like that, would solve that problem.

How much of the interaction between different participants needs to go through the server for the course and how much is happening on their own computers?

That’s kind of where getting rid of suspend data and turning it into a hash-table or dictionary would be a lot better. In a situation like that, it’s like a pseudo-gaming situation; you’re wanting to store and save a lot of data back to the server. Just increasing the size of suspend data might solve that problem, but it would also introduce another problem: having to load all of that data, extraneous amounts of it, all at once, rather than just loading the keys that you request or need at that time and storing individual keys. We’ve got a set of courseware right now that stores a lot of suspend data as XML. So every time it stores it, it has to store the entire XML document that defines whatever data it is that the course uses. Then every time we read it, it has to load the entire document, even if it’s only looking for one little section of that document. Having suspend data in something more like a hash-table would solve a lot of those problems for us.

Beyond suspend data there, would there be a need in the multi-player simulations, would there be a need to set data that doesn’t necessarily belong to the learner that the course is currently running for?

You have to be able to communicate with the others; that’s where SCORM stops and you have to have your own mechanism. The only other thing that SCORM could do to help would be to have a publisher-subscriber pattern where, sort of like a chat room or web-conferencing, each of those clients, when they launch, would subscribe to a chat room of a given ID. So every client that subscribed to that instance could then have access to the same set of data. Other than that I think you just make it a simple hash-table, with maybe some call-backs. So if client one sets the property for X, Y, Z, it sends a call-back to all the other clients saying, hey, this property just changed, and gives them the opportunity to read it and make a change on their end. Sort of a publisher-subscriber thing, so if I’ve got ten people in a course and they all happen to be in the same session, each of those clients would subscribe to something that’s available on the LMS. And I think all the LMS really has to make available is a pretty simple sort of hash-table, and a call-back to all the other clients that some key, or some item in that hash-table, has changed, and which one, and maybe who changed it. It’s up to whoever is engineering the client itself to do something with that data.
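
The shared hash-table with change call-backs can be sketched in a few lines. This is an in-memory illustration under assumptions: `SharedSession`, `subscribe`, `set`, and `get` are hypothetical names, and a real LMS would deliver the notifications over the network rather than by direct function call.

```python
class SharedSession:
    """A per-session hash-table that notifies other subscribers of changes."""

    def __init__(self, session_id):
        self.session_id = session_id
        self._data = {}
        self._subscribers = {}  # client_id -> call-back(key, changed_by)

    def subscribe(self, client_id, callback):
        self._subscribers[client_id] = callback

    def set(self, client_id, key, value):
        """Store the value, then tell every other client which key changed."""
        self._data[key] = value
        for other_id, callback in self._subscribers.items():
            if other_id != client_id:
                callback(key, client_id)

    def get(self, key, default=None):
        return self._data.get(key, default)


session = SharedSession("room-42")
seen_by_one = []
seen_by_two = []
session.subscribe("client-1", lambda key, who: seen_by_one.append((key, who)))
session.subscribe("client-2", lambda key, who: seen_by_two.append((key, who)))

session.set("client-1", "valve.open", True)
print(seen_by_two, session.get("valve.open"))
```

The call-back carries only the key and who changed it; each client decides whether to read the new value, which leaves the behaviour up to whoever engineers the client.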

That sounds like it’s good for getting data between clients about what the other participants are doing; in terms of overall tracking when you have a multi-player activity, do you have any sort of separation between individual performance and team performance you want to store somewhere and is there any place in the API or data model for that?

I think each individual still has their CMI data as well. In terms of overall team performance, you want to be able to look at each person’s performance, which may be different, and the team performance may be different from that. Depending on the model you take, you may want to say that the team performance is everyone’s individual performance, right? It could be, but it could not be, too. It makes sense to have not only individual performance metrics tracked, but team performance too, and that would be tied back to whatever that room is they’re subscribing to, that subscriber session ID.

The way I pictured it is that the team would almost be tracked as if it were another learner, except that it would be tagged as the team. And you’d have all the same data model you could fit on that.

And everyone who subscribed to that item there — how that data was set would have to be controlled by the client course or whatever. And then we’re talking about a pretty non-trivial sort of a course. This isn’t your standard page-turner.

It could get to be too much data flying back and forth to be handled in — it might be more than the LMS wants to be pushed through it.

Not necessarily. We have web conferencing software that we work on. The publisher-subscriber paradigm has become a lot more efficient than it used to be. Bandwidth is not as low as it used to be. The ideal application was the computer-to-computer application: if I want to send you something, I send it to you, not to some dispatcher who finds out where you are and gives it to you. I don’t think peer-to-peer is as necessary as people used to think it was.

I wasn’t thinking just in terms of network, but if you are actually communicating through the data model, that might lead you to write things to the data model you wouldn’t have otherwise, that don’t really need to be tracked or need to be communicated to the other clients, so I suppose there might need to be another way to flag certain data as really ephemeral.

That’s what I was saying, if you had like a shared hash-table just for that session.

That would not survive beyond the session? Okay.

That’s strictly… if you set a key, then that would simply make a call-back to all the other clients that said, “Hey, this key has changed.” Even if it’s just instant messaging, you could write a message bumper that says, “Here’s a message from so and so,” and it notifies all the other clients and they can decide to display that chat message on the screen if they want. And yeah, none of that would necessarily persist between sessions. It wouldn’t necessarily be tied to the performance of anybody. But then again, if it was something you knew wasn’t persisted, that you actually mandated as not being persisted, the LMS could be pretty smart about that and just store it in memory or something rather than having it write into a database, etc.

There could be, you would want to be able to both subscribe to the persistent data model changes as well as have a mechanism to exchange data that should not be persisted.

What do you think about e-learning will change or should change in the next five or ten years?

It should all become richer. What will change is you’ll see more instructor-led content that is online. Because video technology was so difficult over the web and for distance learning in the past, there was sort of an industry push away from it just for technical reasons, but I think as bandwidth grows, you’ll see more instructor-led content with real-time video and you’ll see more self-paced video also. I think video will become really big again. I don’t think video disappeared because people didn’t like it, or because it wasn’t interactive; I think you can make very interactive video mix-ins, or mash-ups, whatever they call them. I think video disappeared because it’s been so hard to deliver over low-bandwidth systems, but I think you’ll see a real emergence of a lot more video.