Is it possible to get the group (that is the subject) number in sbiofit to do a lookup?

1 view (last 30 days)
If I'm fitting using lsqnonlin, is there a matlab script function that returns the group number?
The idea is to allow use of covariates without having to add false dose columns in the dataset. I could use the group variable to look up parameters in a table, and use assignment rules to set parameters. So I could use covariates to set some parameters while fitting other. As in fitting a PK model with some data, and then using the typical parameter values plus covariate rules to set PK parameter when I do a PD fit using data from another trial/dataset.
One could (perhaps) use the following script to set parameter parA
Repeat assignment rule for parA is parA = getparA();
getparA.m
% [parA] = getparA(); % returns parA
gp = getgroupvar(); % This is the function I'm asking about
if ~exist(parAdata)
parAdata = dataset('parAdata.xlsx'); % columns are subject and parA values
end
[~,idx]=ismember(gp,parAdata(:,1));
idx(idx==0)=[];
parA = parAdata(idx,2);
This would allow one to set any number of parameter values without modifying the dataset. Alternately I could create a copy of the subject column (dummy_subj) and create a species subj_number in the model. Then dummy_subj could be used to put a dose in subj_number, and I could do the same lookup approach.
The point is, once one does (for example) a PK fit to get individual parameters (either with nlmefit or lsqnonlin), there are a lot of machinations needed to actually USE the parameter values in subsequent fits.
Also, nlmefit, one can add covariate effect when one is fitting a specific parameter. But (my understading is that if one unclicks the "estimate" tic box so that the parameter is not fitted, then this turns off the covariate effect as well.

Accepted Answer

Jeremy Huard
Jeremy Huard on 26 Aug 2019
Hi Jim,
the alternative you described, which involves a subject column used as a dose for a dummy species is the best way to get the group number into the ODE system. I have tried it in the past and it works just fine.
I recommend you to define parAdata as a persistent variable so that it's loaded only once per simulation.
As for nlmefit or nlmefitsa, you are right that parameters that are not estimated are fixed with the value specified in the model or variant of that task. Do you intend to perform NLME separately on PK and PD?
Best regards,
Jérémy
  2 Comments
Jim Bosley
Jim Bosley on 26 Aug 2019
Edited: Jim Bosley on 26 Aug 2019
Jérémy,
Thanks, this is good news.
Thanks for the persistent tip - I was wondering about that. Will use this
I have a model of physiology that has a lot of parameters. Setting off a huge (50 to 100 state) QSP model to fit 5 to 20 parameters using nlme techniques involves 1) a pretty big leap of faith regarding convergence, and 2) a pretty big data set. Right now, to fit the PK I turn off all the rest of the model (set all reactions except PK to inactive). This allows popPK fitting.
I understand that ideally, you'd fit everything together. Not having much luck with that. Too many DOFs, too few data in some areas. So yeah, fitting PK first, and then using that pk model to drive the QSP physiology is the current plan. There's also things like endogenous protein that binds so we have to figure out synthesis and clearance rate parameters. I can (with some assumptions) do this with initial trajectory and concentration. I'd like to use those QSP parameters along with the fitted PK model to drive total drug and total and free protein. If I have a good drug and protein concentration-time profile for each subject I'd at least be a bit more confident regarding the fit of some QSP parameters.
I said that I understand the ideal, but my impression is that there's an awful lot of experts that do a PK fit first, at least to get initial guesses for a comprehensive fit. Think of it as a manually operated homotopy method! The homotopy variable is zero for fully sequential fitting which is more likely to converge and will converge more rapidly. The variable is one for a comprehensive fit of all parameters and all trials at once. Per the homotopy paradigm, you start with an easy problem that isnt exact, and approach the hard problem that is exact using incremental increases in difficulty and accuracy.
So it would be very cool if the dashboard had another checkbox for each fitting technique that would be "Use Covariates for this Variable?". This would allow you to use the PK values for covariate scaling parameters from previous fits to estimate subject parameters.
Now, with the technique I asked about and tyat you responded to, one can put in the value determined of exp(theta*covariate + eta) into the parvalue matrix, and not just have the fitting programe use exp(theta*covariate), if you see what I mean. So the lookup table is more accurate than just using a theta scaling parameter.
I think I've discussed the subject of script commands that manipulate the sbproj datasets with Arthur and maybe Ricardo, and my understanding is that there are no publicly released script handles for sbproj datasets. That is, I can't open an sbproj and use a dataset for with sbiofit that was created inside the sbproj using the dashboard. And I can't add a dataset to an sbproj by manipulating the model object in the sbioroot object, correct?
Put another way, the only way to use a dataset in the dashboard is to manually add it, and the only way to use a dataset with sbiofit in a script is to create a group data object in the matlab workspace outside of the dashboard, and use it as an argument to sbiofit.
The reason I ask is that if the above is correct, I may add a data structure to the model as userdata. This structure would contain all of the datasets I used. This would allow me to store my datasets with the sbproj file. So then the paradigm is
  1. use the subject (group variable) column (or a copy) as the dose variable to set a species value to the subject
  2. use a repeat assignment rule (Sietse had a comment on answers stating that the repeat assigment was important - initial assignment didnt' work for some reason) to set the parameter value.
  3. The repeat assignment rule types the parvalue matrix as persistent, and then checks for whether parvalue has any content, and if not it reads the parvalue matrix from the structure in userdata. After that and in subsequent runs, the routine then enters the parvalue matrix using the subject value as index, and sets the parameter value using the parvalue output. On subsequent runs, no need to re-read the matrix.
Besides the feature request for the "use covariates to compute" checkbox for parameters, another couple requests would be a direct simbiology function to get a value from the dataset and set a parameter value with it, and a routine to get groupnumber. So the repeat assignment rule would be
[parvalue]=setparvaluefromdataset(groupnumber(), is_active) + not(is_active)*3.45
where 3.45 is meant to represent a default value. When parvalue is to be determined using a fitting routine, this rule would be set to inactive.
Last, a way to add datasets to sbproj files using scripts. Id want something that I could use after opening a project using simbiology() and not sbioloadproject(). This approach allows me to preserve locational data in the model diagram.
Sorry for the logorrhea. Thanks for your response!
Jeremy Huard
Jeremy Huard on 27 Aug 2019
Hi Jim,
Thanks for your explanation and your suggestions!
A possible way to use covariate expressions from a previous fit in a new fit task yould be to use sbiosampleparameters for instance to generate parameter values using the covariate expressions from your first fit. Then you can save them in a MAT file and load them in your custom function.
Another way to do it if the covariance matrix of random effects is diagnoal would be to define the parameter value with an assignment such as par = exp(theta*covariate + normrnd(0,eta)).
You’re correct, there is no publicly available API to manipulate datasets saved in a sbproj file. That been said, data from the Project workspace and data saved in the ‘UserData’ model property are two different things. The first is saved in the model object, the second is.
So, if you prefer to have your save your parameter values in the sbproj instead of a separated file, you can export the model to workspace from the SimBiology App. You can modify the object and save the project from the App once you’re done. This way the diagram layout in the project file will be preserved.
In your repeated assignment, you could look for your model in sbioroot and load the data from there.
But it’s all more of tricky workaround rather than an elegant solution. In my opinion the best way today remains to save your data in a separated file.
For your suggestion to retrieve the group number, as of today you would nee to create a species say ‘groupID’, add a column to your dataset with the group number at time=0 only and dose that species with this dose. Also a workaround but it works well and is easy to implement.
Thanks for all your suggestions, I’ll make sure they get to our development team.
Best,
Jérémy

Sign in to comment.

More Answers (2)

Jim Bosley
Jim Bosley on 31 Aug 2019
Jérémy
Glad if the ideas help. I enjoy the interactions with you, Arthur, Pax, and others on your team.
I just want to clarify the approach. In my groupedData dataset, I need to add a new column called groupID or somesuch. I should set all those values to NaN. Then I add one row per subject, specify simulation time 0, and add the subject number in the groupID. I use one row per subject to avoid repeated dosing which i would get if I had the groupID column completely populated.
I create a groupID species in my model. Unfortunately, I have to use units like grams or some such because species have to have amount or concentration units. To avoid conflicts being identified, my groupID in the dataset may have units of grams! As an aside, I'm setting the groupID species to zero as a default.
I need a new column (that is, I can't just used the column specifying the group) because 1) the software won't allow me to use a column as both a group variable AND a dose, and 2) I need to have the column blank except for the row at time zero.
I use createDoses to get the dose object containg doses from the trial data, plus also the groupID data for the groupID species.
I write a lookup function that accepts the groupID as an argument and returns a parameter value from a table. In my function, a "zero" argument returns a baseline or typical parameter value. luparvalue(group)
I add a repeat assignment rule that sets the parameter value of interest using the output of the lookup function. I'm using the repeat assignment rule because I recall reading something from Sietse that the repeat rule must be used. Don't understand why an initial assignment won't work, but that's somewhat immaterial. I can turn the group-specific parameter assignment on or off by setting the assignment rule to active or inactive.
So I could create one lookup function for each parameter, and selectivly choose which parameters are being specified on a subject-specific basis and which are allowed to be at their baseline value, by setting the assignment rule with the lookup table to active or inactive.
Question: Can I with sbiofit and sbiofitmixed estimate a parameter that has an INACTIVE assignment rule attached to it? That would be ideal. I may check that.
I just realized something. In many models you are trying to get the modeling machinery to replicate a specific time course. Suppose you have an antibody therapy that modulates clearance of a protein that drives a function. You really want a very accurate time course of protein, so that you can see the correlation between the protein and outcome. So, for fitting protein to outcome you could use the actual trial data with interpolation if sampling is dense enough. A spline (or other function), fit to the protein data would be way more accurate than a PK model form.
You need the PK model for predictive simulation, but maybe not for getting and idea of how the protein modulates the outcome. Hmmmm.
I think that my PK model gives a pretty good representation of the protein - but if not I can use the groupID species and time as arguments to an interpolation function!

Jim Bosley
Jim Bosley on 3 Sep 2019
One last thing. I tested using the group to set parameters. Works fine. So a repeat assignment rule is just
pv = setparvalu(Group)
base_value = 42;
valmat = [ 0 base_value; ... % Example matrix for groups 1, 2, and 3. Group 0 is exemplary: it should never be accessed
1 42.1 ; ...
2 42.2 ; ...
3 42.3 ];
if Group==0
pv = base_value;
else % In reality you'd check to ensure that Group is a member of valmat(:,1). Error handling not shown. [~,index] = intersect(valmat(:,1),Group)
pv = valmat(index,2);
end
The repeat assignment rule is just par1 = setparvalue(Group). Group is a species that has no reactions and is dosed at time zero. BTW, I haven't checked whether constant_value or boundary_condition should be checked for Group.
If you want to use the dashboard or sbiofit to determine par1, just inactivate the assignment rule. Seems to work.
  1 Comment
Jim Bosley
Jim Bosley on 3 Sep 2019
Edited: Jim Bosley on 3 Sep 2019
One last idea. Since one can use can use sbioparameterci with fitresults (the OptimResults object from sbiofit), one could use the parameter estimates and ci's from an OptimResults object as the starting points for subject-speciifc fits. So one could do:
estinfoPK = {set up PK parameters}
fitresultsPK = sbiofit(sm, data, resmap, estinfoPK, dosing, 'lsqnonlin')
estinfoPD = {set up PD parameters}
fitresultsPK = sbiofit(sm, data, resmap, estinfoPD, dosing, 'lsqnonlin', ...
'UsePriorEstimates', fitresultsPK );

Sign in to comment.

Communities

More Answers in the  SimBiology Community

Products

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!