Interview with Doug Stein, (Principal) MemeSpark LLC

Key Points:

Personalized learning is a growing interest in K-12.
See work DreamBox learning is doing, for what they are doing, SCORM, even CC breaks down, they are using formative assessment.
Adjustable difficulty, level of support, based on learner
Could be supported by providing learner profile, performance information on past SCOs, learning experiences
There has to be some way of letting cohort data influence behavior of SCOs.
This should be preprocessed, can’t put too much computational burden on the SCO.
Not all SCOs will be able to use all learner profile elements, needs to be extensible
Three cases to consider:
- mixed instruction and assessment with adjustable difficulty
- mix of instruction and assessment objects, customized daily playlist
- bank of simple questions, used to detect level
There is lots of interest in recommendation engines.
- currently no standard packaging for pathing data
- currently no standard to describe cohort data
Should support adaptive assessment as one extreme, using simple questions to measure level.
SCORM does not have robust representation of assessments.
- need to support question bank, tracking questions used
- should look at IMS QTI
Don’t require a browser session.

Would you please tell me a little bit about your background?

Sure. My first really deep engagement with SCORM was as the VP of development and technology at a company called Learning.com. We adopted SCORM, actually we licensed the SCORM engine from Rustici, and built it into our platform. It was very successful, and allowed us to deploy a multi-product platform, where we could separate the content development process from the learning management process. I was the VP at Learning.com for four years, and left in 2007. We built a platform for instruction and assessment; it’s now branded as a digital learning environment. So it’s a lot simpler, same with the K-8 market, a lot simpler than the heavy-weight LMSs. Definitely saw some benefit there.

I’ve spent the last four years doing strategy and technology consulting in the EdTech industries. I work with publishers who are trying to become online companies, online companies who are trying to scale because they’ve outgrown what they built on their seed capital, and companies that have interesting R & D they’re trying to commercialize. So that’s my interest in education, kind of a long-term interest. I make my living by making companies successful, helping them get through their problems.

There’s another aspect to my work. My primary outlet for pro-bono efforts as a consultant is through the SIIA, the Software Information Industry Association. I serve on both their Technical and Development committee and their Personal Learning committee. Personalized learning is a growing interest in the Education space, in the K-12 market, certainly. It’s shown up as an important part of the National Education Technology Plan that came out of the Dept of Ed very recently. So there’s a lot of push towards not just delivering content where you have to take this course for compliance. Think about the origin of SCORM, I always felt that it was to lower the cost of airplane manuals, when there were only going to be slight differences from one engine to the next.

The initial emphasis of SCORM was about being able to assemble content in different and interesting ways to build courses. But it was always assumed that everybody was going to go through the same course, would have to hit the same competencies. Then there’d be some branching based upon choices made within the course content, and there’d be simple checkpoints, embedded assessment, but fairly simple checkpoints. I don’t want to slander SCORM, it’s a very valuable set of problems — it’s a very valuable solution to a very discrete set of problems. As you move into markets with different needs, SCORM has been stretched quite successfully. But a part of Project Tin Can is that you’re feeling the stretch and there needs to be a look at SCORM — Do we need to make changes to it so that it can better address the problems that have been coming out over the last ten years?

So, my interest is that I’ve been working with a couple of companies that are doing personalized learning. I mentioned one of them in one of the ideas I posted, a company called Dreambox Learning. They take an approach to personalized learning where the way SCORM is currently designed, tends to break down. Even something like Common Cartridge, which is much more explicit about: We have instruction, we also have assessment, we also have hooks to high value external systems. Even IMS Common Cartridge would break down. The problem, from an educational point of view that SCORM will have trouble with, is that the boundary between what is instruction and what is assessment gets fuzzed and broken down when you start doing personalized learning. It’s much more of a fluid transition between, “We are teaching you something,” and “We’re measuring what you understand.” The education people talk about formative assessment, where the act of assessing is one of the means to actually cause the student to understand. So it’s not so much an assessment of what they’ve learned, but an assessment to make them learn.

By asking the right questions that sparks thought?

By asking the right questions, by detecting the teachable moments in real time. If you think about what SCORM does with its sequencing model, right now, it’s an a priori sequencing model. It’s designed, authored, and thought through by the content experts in conjunction with the instructional designer. So you map out the pathways, put the decision points in. Companies like Dreambox do that as well. They start with expert teachers, they build the branching, but then they tend to do two things in addition to that. One is, the nature of the question or the content that is delivered is not fixed and static. You can think of it as a knob dialing up the volume or dialing up difficulty. At the point you deliver a particular screen or lesson, you may deliver it minus certain supports, because you have detected that the student doesn’t need them. Pedagogical supports, like a legend or help texts, or “show me how.” You might dial those back in if you detect the student is starting to have difficulties with the concept. So even if you decided statically that, “I’m going to go to this piece of content,” the difficulty level is dynamic. There almost has to be a way to get a parameter into the finest-grained SCO to be able to say something about the learner and what. Think of it as pre-parameters, in a C program. In other words, if you run it with no options, you get one behavior, but you have a way of getting command-line switches in to modify the behaviors. Similarly, there needs to be a way to inject facts about the learner so you can modify the behavior of the SCO, if it’s been authored in that way. And SCORM doesn’t really have a good way of doing that.

You’re saying there will be a need for both some sort of learner profile, which would record how well that student or learner had done in previous courses, or any other way you’ve assessed their learning along the way, even though they’re packaged, but then also as you move from SCO to SCO, you need to keep track of some metric of how they’ve done there, and possibly on a set of different competencies, not just one.

That’s correct. So that’s one aspect, which is to modify the SCO based on learner profile, what you know about the learner. And then, related to that, I’m not sure what the right implementation is, a good general implementation, but in some sense you need to also modify the SCO based upon cohort analysis. Because it’s not just the learner and what you know about the learner, that gets back to your a priori model of what you think learners should be ready for. But usually what happens to these adaptive systems when they’re deployed is you have thousands of people using them, you start to detect patterns. The way to think of it is the educational learning equivalent of Amazon, which creates a dynamic offering. You go to buy a book and they say, “For this price, you could get these two books together and save five dollars.” And that was done on the fly; they looked at patterns and said, You know what? People that bought this book are also interested in buying this book, and maybe we can get the sale of both books if we offer them together at a discounted price in a bundle.

So there’s an evolution of the a priori model, it’s called a posteriori analysis, which means after the fact, after you have deployed the SCO, after it has been used by large numbers of people, you start to detect patterns. You can modify the learning paths, or you can modify the difficulty levels, because you’ve done some segmentation and prediction. Because you’ve had enough data that you can group learners and then having grouped them, do you want to say, Hmm, I should try this next. It’s not so much modifying the profile, but it’s the profile plus some aggregated cohort data, both being used to modify the SCO. And again, this is not something that people have to author into SCOs, but there has to be some way of letting that cohort data influence the behavior of the SCO.

Right, and that could be — there’s a question of how much crunching of the profiles can the SCO do on the fly? But, one–

I don’t think they have to crunch profiles so much as there has to be a way to inject the distilled information. I see it as hard to make it portable if you start imposing on SCOs the requirement to have significant computational power. I think of it as, can I initialize the SCO with learner profile information that the SCO can then adjust and set up knobs. We’ll set the preset because the bassist plays jazz as opposed to rock. It’s like presets to organ stops. He’s gonna play a particular piece, so you push one button and get all the stops set right for that particular kind of piece. You have a particular learner with a particular profile, so you set the difficulty level and the parameters that the SCO was pre-authored with, if someone chose to author it that way, so it knows a given kind of learner. And then it’s almost like there’s a second call, periodically, to say, “Do I want to modify this based on cohort data? You told me about the learner; is there any further information you want to give me at this particular point based on your digested cohort data.”

Is that something you’re solving now?

It’s something we’re solving now, but they’re doing it without standards, which means they’ve been responsible for all the tooling. Which is a strain on an early company. You can’t get outside authors because you can’t get robust tools, because they’re home-brew and they don’t have the time and money to polish them to the point where you can just hand it to people to crank out content. And that seems to me like what the learning world was before SCORM. You didn’t have a standard where some people could make really great tools, and other people could make really great content with those great tools, and other people could make platforms to run them. So, they’re stuck in that world of having to do more of the pieces themselves, because there isn’t a standard out there.

They are someone you would want to talk to, their title’s Director of Personalization, but he’s responsible for the engineering team for the equivalent of the SCORM client, in other words the runtime, that’s detecting what the students are doing so you can capture rich data. What they have is a process where teachers can author, not school teachers but people who have been trained as teachers, and who are experienced educators, they combine content expertise, domain expertise and instructional design. They author the pathways visually, they build the free parameters into the lessons, it’s all around virtual manipulatives, so it’s very rich item types, and they publish this lesson. So that’s they’re job; they think in terms of: when I see this behavior, here’s how I should adapt the lesson, so they get profile information and the lesson modifier dials the difficulty based on learner profile. Then it gets modified further by having cohort data injected. They maintain a much richer data set about the learner than you would typically think of in SCORM.

Related to that, I think in terms of a standard, you’ll think in terms also of having extensions to the XML — something that if you don’t understand it, you don’t respect it, than just pass it through without modifying it. It’s data that you respond to if you care about it, so that would be a standard way of putting an extension in an XML. There’s a set node of the tree for stuff that’s pertinent to your product, that’s semi-proprietary, or it’s an extended profile, it’s an extension. Process it, or ignore it. If you know how to process it, do it; if you don’t know how to process it, ignore it and pass it on through.

So, I guess some of the learner profile information, some SCOs wouldn’t know how to process it, I suppose, and that would — but they wouldn’t need to pass it through. Are you also thinking that the LMS, when we’re sending data back, would basically be that there would be some data that we’re defining as standard format for that data, but that the LMS might not need to know what to do with that data, except to store it and then respond back with it when queried.

That’s correct. So the idea is to enrich data if you can, and to transmit it unchanged if you can’t. But not to strip things out.

Is there a way you would see the LMS enriching the data, or another component in between the content and the LMS?

Yeah, if I think about a world where you are building both more instructionally-focused SCOs and more assessment-focused SCOs, and trying to build bigger units that blend them — Let’s say you have a nifty little SCO that’s effectively five screens and it teaches a single concept. And somebody else has built a small adaptive quiz on that level. The LMS might take the information and it might have its own model to say, “If I see this kind of behavior, I need to offer a choice of these SCOs. So instead of directing the student down a hard path, I will give them the choice of three SCOs that would be useful for them to do next. So this is basically putting a recommendation engine into the LMS. That’s the place where an LMS could be engineered to add value, as in: Hey, I have these SCOs, there’s sequencing, but as we all know, a lot of the SCOs are built so that you take the course, or the lesson, and then you stop. And that’s it. And somebody like a teacher has a — in a school environment, someone who is using Blackboard might have 300 SCOs for a freshman calculus course. One per book-lesson. And, the SCOs, that’s the highest level SCO SCORM would be operating on — we teach, teach, teach, test, test, test, then we’re done. And the decision about where you go next was pre-determined by the professor who picked, out of the 300 SCOs, that he wants you do chapter one, lesson one, and he wants you to do for homework tonight chapter one, lesson five. So an added value in LMS would be to take outcome data generated by one of the SCOs, and be able to again, use cohort data, and make a recommendation saying, “Well, you did well on this; here’s three things people have found useful next.”

Or, if the SCOs are purely instructional or purely assessment, there’s — Joel Rose at the School of One — and they’re a really part of, they’re funded out of the New York City school district, and he’s established this school that’s trying to use a very different instructional model, and it’s very technology-driven. And the way to think of it is that you have a series of instructional objects, that you hold in a large pool of them, and some of them are lessons and some of them are assessments, and every night based upon what students , you crunch the data and build a customized playlist for each and every student. The student does that, the next night you crunch the data and the idea is to take students who are all over the map and drive them all towards competency through whichever pathway works for them. And so there’s a lot of interest in recommendation engines, which have up until now, been proprietary things built into LMSs, usually under the names of “authoring personalized learning paths.” But currently, it tends to be different from one system to another. If I wanted to move my course from Blackboard to Desire to Learn, all that learning path information, I’d have to rebuild it because there’s no packaging mechanism for it. There’s no way to, there’s no standard around describing cohort data. At some level of recommendation engine, there’s a graph or adjacency matrix, which should be something that could be encoded in a fairly standard way. But you basically want to say, here are the transitions and they’re not coming from inside the SCO. These are inter-SCO transitions that are coming from outside based on cohort analysis. And they may be different from one learner to the next.

The other people that are probably worth talking to is an education company, NWEA, Northwest Evaluation Associates, and their whole business is adaptive assessment. They don’t do instruction, they have humongous item banks and they have a psychometrically valid vertical scale and they, when you take a test, you don’t know how many questions you are going to get. You sit down, and if all they know about you is that you are fourth grade, they start you at a sort of neutral level for fourth grade, and you get the question right or you get the question wrong, and then based upon that, they pick the next question and the next question and the next question. So your value on this vertical scale is standard error gets down small enough, we know where you are.

So that’s another, School of One, and NWEA are some companies which have scenarios that would be, they would love it if there were a standard so they could break apart the authoring process from the deployment process.

Right, and then they wouldn’t actually have to design these decision points and —

Exactly, so just to recycle, and make sure I’m clear: You’ve got a pure assessment company that has adaptive, very simple questions, and that’s one extreme you want to support. Think of it as SCOs that don’t necessarily want to teach much, but just measure. Another point is the School of One where you have instructional objects and assessment objects, and some, for all that we know, do a little bit of both. They have embedded formative assessment. But the purpose of the system is do the customized per student playlist, and update that on a daily basis. So in their model, the student comes in and they’ve got things to do on their list, they work through their to-do list, and then the next day they come in and get a new to-do list. It’s basically based upon trying to drill down and figure out where does this student need to grow and where do they not need to grow. And you’re moving away from a seat time based model of learning to a competency based model.

Then, the folks like Dreambox, instead of doing it in an overnight batch cycle, they’re trying to do it in real time. At the moment, we see that you’re having difficulty, we dial up the supports, simplify the questions, figure out where the weakness is and update the profile, and the cohort data in real time; Although they take time to crunch the cohort data to modify the pathways of all of them. Those are three scenarios where personalized or adaptive learning would benefit from SCORM having a wider notion of sequencing and initialization data for a SCO and what gets reported back to the LMS.

You’ve talked about some of these engines for determining what the next step should be for the learner based on previous performance and the cohorts’ performance, if when defining APIs around that, should we assume that that is a service of the LMS, or just that that is a service that is provided outside the SCO, that could be provided by the LMS or could be provided by another purpose-built service that just does that?

Yeah, it’s a good question. Take IMS LTI and its place within Common Cartridge: They glom together the three standards. They glom together a packaging and sequencing service, which for all intensive purposes is SCORM. Plus IMS QTI, for assessments as opposed to instructional modules. And I think they did that because QTI has a much richer notion of what is a question, is much more structured than just a floating point score, or floating point objectives, or set of objectives with floating point scores. It’s more the notion of an assessment has a bunch of questions, the questions are of discrete types, like seventeen discrete types, and they try and make it very straight-forward to render and to interpret. The third standard they put in is the LTI, which is: how do I interact, how do I have my LMS interact with an external system, and they’ve always been kind of — that’s the vaguest part of the standard. I think they’ve been most fuzzy on the use-cases there, but the two use-cases that pop up, the one that’s used the most is high-value publisher content behind a pay-wall. So, the LTI is used to jump out to peer center Thompson get out the book that you had to pay and get an unlock code for. So being in the course gives you the LMS access and then if you went to the bookstore and paid $20, you got the key code that unlocked this pathway that used LTI that gets to the publisher’s site where they have the book. So that’s not the most interesting one except to publisher’s.

The other one they mentioned a couple times is adaptive learning systems. But they don’t really have a standard or have clear examples of how those tie in to an LMS. The best scenario I’ve seen is that the LMS is not adaptive content and then partway through you want to do something like NWEA, so it would be the model of saying, “At this point we’re going to have a test. And the test is going to be adaptive and the LMS doesn’t know how to do adaptive tests. So we’re going to have an escape hatch out here where you do your adaptive test and then the score will come back into the LMS.” So that’s kind of the direction they’ve gone, but as you can imagine, it puts more than integration burden on the sponsor of the platform, the customer of the LMS. They have to have two systems and they have to talk to each other.

There is, and I don’t know how — I think there’s enough consensus on some of the simpler cases. The three cases I’ve mentioned of NWEA pure assessment, adaptive; School of One built a playlist. And sort of the more real time that Dreambox is doing should be able to subsume those into an LMS platform.

I guess just because there’s an API defined — I guess we start with the assumption that it could be external, that precluded being part of the LMS; but what I’m thinking about is, if people have an LMS that they haven’t upgraded in a while and the want to tack this functionality on to give them potentially some way to do it.

That would mean having targeting information as to what system is going to respond. You wouldn’t put an IP address, for instance, you wouldn’t put a URL, but you’d name something that says: “This package is intended for whoever handles adaptive assessment.” If there’s nobody then you don’t offer it. If there’s somebody, then the LMS says, “Well, I don’t do it myself, but I know where to route this.” That might be upgrade paths for a LMS that wants to play with an enhanced SCORM standard but doesn’t want to build all the functionality that the SCORM standard supports. It’s kind of delegation mechanism saying, “Here’s a request for the service that does this. I don’t handle this service; I can forward it on.”

Were there any other major topics you wanted to talk about?

That was the major thing, but I have three ideas I added to the Tin Can forum site. “I want adaptive courses with embedded formative assessment” was how I tried to capture this much deeper discussion that we had. The other two which I supported, but didn’t write, one was the one — “SCORM does not have robust representation of assessments”. What IMS did is they kind of stuck QTI to the outside of SCORM, they wrapped a bungee cord around them. QTI is a good example of a more robust notion of what is an assessment. Right now, assessments in SCORM, they don’t really feel like assessments. They feel like — think of the difference in a text book for the check questions in the middle of a chapter and then maybe some longer check questions that are end-of-chapter questions, as totally separate from the mid-term.

So, you’re saying that SCORM doesn’t provide a concept — what people tend to have within a course is more like those check questions right, and less like the mid-term?

Right, and the QTI in particular, and again I haven’t checked the very latest release, but they are to the point where they have seventeen kinds of items. What is a question? It starts with true/false, to multiple choice, to short essay. They have two that are very open-ended, I think sixteen and seventeen are a Java applet and a Flash SWF. So those are kind of open-ended, an escape path when you can’t figure it out. There was at least an attempt to give more structure to outcomes than merely a set of floating point numbers with a scale on them. Which is what’s in SCORM.

With objectives, sure.

Which are objectives, so there’s the question of how do you map an assessment to whether an outcome has been met. But I don’t think people think of it that way, they about saying, Here’s the assessment, the assessment might feed into multiple competencies or multiple outcomes, but I would like to — almost everybody that you talk to wants to drill down and see what did they get on this particular question.

When I was at Learning.com, we were using SCORM to capture outcome-data, and we did that for lessons and we could have embedded checkpoints. But when we went to build a psychometrically valid assessment, we had to not use SCORM. You had to go to something that was structurally the same as QTI. We needed to be able to capture responses, not just, how did they do? When you think about any kind of environment that does assessment, there’s a couple of things; first of all, an assessment is made up of questions. Questions are drawn from the item bank; sometimes it’s random draws from the item bank. The items, you have to keep track of how many times they’re used and what stage they are in. Because the lifetime of an assessment questions goes from data, which doesn’t mean you don’t know whether technically it’s going to work, but we don’t know whether it provides discriminatory power, if it provides you with anything. In simple cases, the question that everyone gets right doesn’t tell you much. And the same for the question everyone gets wrong. So you look for discriminatory power. When you build an assessment, you dribble a few questions in at any given time that are of unknown quality. You don’t count them towards the kid’s score, but you do measure the outcomes and then do the analysis. And then it gets promoted to a production question, and it gets randomly put into assessments. And sometimes the draw is 100% of the item bank, if you have a small item bank everyone gets the same test. But generally it’s done as a random draw or some kind of interesting draw from an item bank, and after it’s been shown enough , it gets out there and people write about it. The Kaplan test prep programs start to mention it, and it loses discriminatory power because everybody’s been coached, so you retire it.

If you don’t have data — and the data is not just what was the outcome, but I have a question — think of the case where multiple choice questions with four distractors — you have choices A, B, C, D. I need to record whether you put in an A a B a C or a D. Because that’s part of the data when you analyze a question. Does everyone go to the same wrong answer? Are the distractors interesting? Do they tell me about people’s misconceptions? So, to do a robust representation of assessment, you have to have a more structured notion of what a question so that you can get more structured meaning out of the outcome data that the learners generate.

And SCORM would provide, or does in interactions, it provides a way to track answers, but it doesn’t, and they can be tied to objectives for — which gives you the numeric sense of the final outcome, but then you can put an ID on them to potentially tie them back to a particular question, if you have a lookup for that ID internally. But then there isn’t a concept of a question bank, or a way to do any of the other things you were talking about. The data could be recorded, but then you would need another system to crunch that data. You would need something to pick some valid questions and some questions that we don’t know yet if they’re valid or not, and that’s not really provided.

Right, and what I’d recommend is to look at IMS QTI and say, “Can we do something like this or better, or even more?” And embed it into a SCORM object, into a SCO, as to having it sit alongside the SCOs. Common Cartridge basically says, you got your lessons, your assessments, and the two together make a course, along with hooks to any external systems. Even if you don’t have an item bank, even if you are pre-authoring a fixed set of questions, can you give me sufficient structure that I’m not having to link things with IDs and what not, that I know that I’m injecting into the course, that what appears or what is authored into the course appears, is a five question quiz that has this structure and the information that QTI is offering, which is much more information about the kind of question it is — not necessarily how it’s rendered. There are many many ways of rendering a multiple choice question, but it’s basically the concept of multiple choice and this is one of the seventeen item types. That gives me a lot — it makes it much easier for the LMS on the back end to pick it apart. You can still ignore the details and sort of report top level objective scores, but it’s much less work for the author of the content to say what kind of question they have.

Again, if you look at a lot of the LMSs, they embed quiz-creation tools. It’s one of those areas currently where it’s not part of SCORM and they build quizzes and don’t necessarily export to other systems. It used to be the case, up until recently, that the latest version of Blackboard, they finally implemented Common Cartridge, so you can export from Blackboard. The thing against Blackboard was you could author in it, but it was always a big risk because you could never export from any other system and have fidelity. And so they kind of had to go and implement Common Cartridge. And there’s nothing wrong with that. Common Cartridge is trying to subsume SCORM and maybe SCORM needs to try and subsume some other IMS standards for different purposes.

The third one is much more minor. It was about requiring a browser session. Come on, think about it. If you wanted to deliver as an app on an iPhone, it’s not going to be a great experience trying to fire up the browser. If you want to use Silverlight, in other words, a rich internet client. It was always a bit of a head-scratcher for us, when I was at Learning.com, we built using SOAP based web services for the communication between the piece of Flash in the client and the LMS. And SCORM brought a lot of benefits, so we implemented it, but it felt so kludge-y, having to have these frames and javascript. It just felt like a pre-web service interaction mechanism. Sort of about time, whether you use HTTP Request objects or not, there’s gotta be a better way to shovel information back and forth without loading the pages.

Yes, it is, and I think we all get that at this point and it probably has less votes than some other items, because everyone is just looking at that and kind of correctly saying, Well of course that’s going to be on the final list. There isn’t anyone saying that SCORM should continue to require a browser session.

That’s why I left that for last, that’s an easy one. So the two then, to make adaptivity and personalization possible, and as a step towards that, to be able to have a robust notion of assessment.

Interview with Doug Stein, (Principal) MemeSpark LLC

Related posts

ADL funding news won’t really impact SCORM

The World Just Changed

Tin Can First

Ben Clark