Is it possible to set watchdogs (memory, exec time) in MATLAB?

=== THREAD SUMMARY ============================================================
Please, feel free to edit this summary if you want to and if you have privileges for editing questions!
Question: see below.
Extra: it would sometimes be useful to have runs killed (by MATLAB, the OS, or a third-party tool) when user-defined memory or CPU-time thresholds are exceeded. Of special interest are evaluations of built-in MATLAB functions that push the system into using swap files (e.g. a basic indexed assignment that makes a sparse matrix a little too dense). This is not a crash (no exception is thrown), but it freezes the system, and the operation cannot be interrupted (at least it seems so).
I will send a request for enhancement today, proposing a mechanism that would allow the following (maybe limited to a non-parallel context), because it illustrates better than words what I had originally in mind:
try('exectime', 1800, 'virtmem', 7e9)
...
catch ...
...
end
or
tryOptions.watchdog.virtualAddressSpace = 7e9 ;
tryOptions.watchdog.execTime = 1800 ;
try tryOptions
...
I got a reply on Jan 25th: " Thank you for sending in the exact details regarding the specific enhancement. I did also go through the submission on MATLAB Central, and have shared all the relevant information with the Development Team including the reference to MuPAD's ability to limit the memory used for each session, and we may consider implementing this feature in a future version of MATLAB. "
Synthesis:
  • The Symbolic Toolbox has such a mechanism. It is however based on an engine/product (MuPAD) that is completely separate from MATLAB.
  • Cody must use some mechanism for controlling memory?/cpu-time? It could be relying more on the Linux VM than on MATLAB itself though (open question).
  • ODE solvers options allow setting some limit, but they are specific to these solvers. The mechanism (event based) is likely not to be applicable for controlling e.g. built-ins.
  • Linux would offer tools for controlling MATLAB processes, e.g. ulimit, ps, pmap, kill.
  • Per proposes a solution based on dbstop [see code below] (which is, I guess, what Jing had in mind). I extended it a little [see code below], but it is still limited and won't catch e.g. valid operations that lead to swapping.
  • The extension of Per's proposal could use an undocumented feature mentioned by Benji here for managing memory: feature('memstats'). Its output can be parsed, e.g. like this, to get the three 'In Use' values:
T = evalc('feature(''memstats'')') ;
str2mat(regexp(T, '(?<=Use:\s*)\d+', 'match'))
  • Per suggests that pagefaults ( link ) could be exploited by an external process.
  • Windows System Resource Manager ( http://technet.microsoft.com/en-us/library/cc732553.aspx ) seems to be doing that, but it seems to be specific to Windows Server 2003/8/12. I will check later today and update the thread.
  • Process Lasso ( http://bitsum.com/prolasso.php ) lets you set watchdogs on processes. As it is external to MATLAB, it is not well suited for setting a timer on a block of code, though. I am currently testing this product. It successfully killed MATLAB based on a memory usage watchdog, but it seems to slow down the MATLAB UI. News 02.01 - Process Lasso seems to be working well. However, if you install it, I highly recommend setting it up so that its core engine starts only when the interface is launched and not when Windows starts, because I can definitely feel the core engine running in the background and I see regular disk accesses specific to PL.
  • [Currently in test] Per mentioned WMIC as an option for getting system information; info here and here.
  • [Currently in test] Per and Jan (in this thread) mention the System Information Class for Windows by Xin Zhao. I've been using it for a while and I will probably update my example function watchdog_onLine() based on what I am learning from this class.
Summary Updates:
02/01/2013 - C.Wannaz - Mention of WMIC and SysInfoData.
02/01/2013 - C.Wannaz - Pagefaults (Per), memstats (Benji), P. Lasso update.
01/25/2013 - C.Wannaz - Answer Mathworks, solution from Per (and modified).
01/23/2013 - C.Wannaz - Start.
=== ORIGINAL QUESTION ============================================================
Dear all,
Does anybody know whether it is possible to set watchdogs in MATLAB, e.g. stop execution if memory used by MATLAB > 7GB or execution time > 10min?
I have seen several dead threads about this topic, but also that such a feature seems to exist in the Symbolic Math Toolbox (which I'm not familiar with at all).
I guess that it is always possible to have an external process that conditionally kills MATLAB, or maybe to build a solution based on MEX/multi-threading, but a built-in watchdog feature would be cleaner/simpler.
Thank you and best regards,
Cedric
  6 Comments
Cedric on 23 Jan 2013
Hi Daniel, yes it came while I was editing the summary; I will re-edit it periodically with new information. Thank you for the idea about WSRM, I'll check that later today!
per isakson on 1 Feb 2013
Edited: per isakson on 1 Feb 2013
A problem with memstats is that it is rather slow:
>> tic, T = evalc('feature(''memstats'')'); toc
Elapsed time is 0.034273 seconds.
>> tic, T = evalc('feature(''memstats'')'); toc
Elapsed time is 0.034608 seconds.
However, it might be the best there is. See my question.


Answers (6)

per isakson on 24 Jan 2013
Edited: per isakson on 24 Jan 2013
I think it is possible to implement a watchdog based on side effects of conditional breakpoints. Here is a demo of the approach.
>> clear
>> watchdog_demo
Dog is barking!
10 17;
K>>
where
function watchdog_demo()
    watchdog( 'exectime', 3 )
    % The line number ('10') must be that of the statement "17;" in this file.
    dbstop( 'in', 'watchdog_demo.m', 'at', '10', 'if', 'watchdog' )
    myfoo_
end

function myfoo_
    for ii = 1 : 100
        17;
        pause(1)
    end
end
and
function bark = watchdog( varargin )
    persistent start_time time_limit
    if nargin == 0
        if now - start_time > time_limit
            bark = true;
            disp('Dog is barking!')
        else
            bark = false;
        end
    else
        if strcmp( varargin{1}, 'exectime' )
            start_time = now;
            time_limit = varargin{2}/(24*3600);
        else
            error('Unknown first input argument')
        end
    end
end
Open issues:
  • the user must know where to put the breakpoints
  • steals some cpu-cycles
  • more complicated if watchdog should handle more than one function simultaneously.
  • and more - I guess
  1 Comment
Cedric on 25 Jan 2013
Edited: Cedric on 25 Jan 2013
Thank you for your answer, Per! I will get back to it as soon as I finish reinstalling my machine.

Cedric on 26 Jan 2013
Edited: Cedric on 26 Jan 2013
Test script and function last edited on 01.25.2013 9:28PM EST.
Here is the version proposed above by Per Isakson, with a few modifications. It is just for the fun of it, because it won't interrupt e.g. a valid operation that leads to swapping.
Test script:
watchdog_onLine(9, 'time', 6, 'memory', 1e9) ;
%watchdog_onLine('num2str', 58, 'time', 6, 'memory', 1e9) ;

C = cell(10, 1) ;
for ii = 1:10
    fprintf('Iteration #%d..\n', ii) ;
    C{ii} = rand(7905) ; % ~500MB matrix.
    num2str(65) ; % For testing dbstop in func.
    pause(1) ; % Line 9.
end
Function watchdog_onLine :
function bark = watchdog_onLine(varargin)
    persistent Time ;
    persistent Memory ;
    parser = inputParser ;
    parser.StructExpand = true ;
    parser.CaseSensitive = false ;
    parser.addOptional('fileLine', 0, @(x)ischar(x)||isnumeric(x)) ;
    parser.addOptional('lineNo', 0, @isnumeric) ;
    parser.addParamValue('time', 0, @isnumeric) ;
    parser.addParamValue('memory', 0, @isnumeric) ;
    parser.parse(varargin{:}) ;
    args = parser.Results ;
    bark = false ;
    if nargin == 0
        if Time.limit > 0 && (toc(Time.start) > Time.limit)
            bark = true ;
            fprintf('WATCHDOG: execution time > limit=%g!\n', Time.limit) ;
        end
        if Memory.limit > 0
            [~, sv] = memory() ;
            if (Memory.start - sv.VirtualAddressSpace.Available ...
                    > Memory.limit)
                bark = true ;
                fprintf('WATCHDOG: memory usage > limit=%g!\n', ...
                    Memory.limit) ;
            end
        end
    else
        if ischar(args.fileLine)
            filename = args.fileLine ;
            lineNo = args.lineNo ;
        else
            dbs = dbstack() ;
            filename = dbs(2).file ;
            lineNo = args.fileLine ;
        end
        if lineNo < 1, error('Line # < 1?') ; end
        dbstop('in', filename, 'at', num2str(lineNo), 'if', mfilename()) ;
        Time.start = tic ;
        Time.limit = args.time ;
        [~, sv] = memory() ;
        Memory.start = sv.VirtualAddressSpace.Available ;
        Memory.limit = args.memory ;
    end
end
When run with a 2s time limit, we get:
>> watchdog_onLine_demo
Iteration #1..
Iteration #2..
WATCHDOG: execution time > limit=2!
When run with a 1GB memory limit, we get:
>> watchdog_onLine_demo
Iteration #1..
Iteration #2..
Iteration #3..
WATCHDOG: memory usage > limit=1e+09!
  5 Comments
Jason Ross on 4 Apr 2013
For prototyping with WMI, I've found the Scriptomatic tool from Microsoft to be indispensable. You pick the class you are interested in and it will generate a working script in a variety of languages. You can be up and running very quickly without having to invest the time to (perhaps) learn a new language, and you can find out if the WMI counter you think you want to use is actually delivering the information you are interested in.

Jing on 22 Jan 2013
Hi,
The memory watchdog in the Symbolic Math Toolbox applies to the MuPAD session, which means it does not apply to the MATLAB session. I think the better way is to run in debug mode using 'dbstop'.
  2 Comments
Cedric on 22 Jan 2013
Edited: Cedric on 22 Jan 2013
Hi Jing,
dbstop won't allow such memory/time management, because there is no mechanism (to my knowledge) that would throw exceptions when a user-defined memory/time limit is crossed; such a limit would also have nothing to do with the function call stack.
My point about the Symbolic Math Toolbox was that such a mechanism exists somewhere (unless this toolbox is a completely separate product/engine), so it should be implemented somehow in MATLAB, maybe without being documented(?). Walter made a step in this direction in his comment about Cody.
Walter Roberson on 22 Jan 2013
The Symbolic Toolbox really is handled differently. MathWorks bought an existing software product (MuPAD), which is what gets called upon to do the work.

Bjorn Gustavsson on 22 Jan 2013
If you look at the documentation for the ODE solvers, you will find that you can pass an events function in the options. So for those functions you can write an events function that triggers after a given run time and then terminates the integration (I think this is not how it is usually done...).
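As an illustration of a run-time cut-off for an ODE solver, here is a minimal sketch. Note that it uses the solvers' OutputFcn option rather than an events function: OutputFcn is called after each successful step and halts the integration when it returns true. The 0.5 s budget, the variable names and the test problem (a deliberately stiff Van der Pol oscillator handed to ode45 so that it needs many steps) are assumptions chosen for the example, not something from this thread.

```matlab
% Hypothetical wall-clock budget for the integration, in seconds.
timeLimit = 0.5;
tStart = tic;

% OutputFcn is called after every successful step; returning true halts the solver.
opts = odeset('OutputFcn', @(t, y, flag) toc(tStart) > timeLimit);

% Van der Pol oscillator, made stiff on purpose so that ode45 needs many steps.
mu = 1000;
vdp = @(t, y) [y(2); mu * (1 - y(1)^2) * y(2) - y(1)];

[t, y] = ode45(vdp, [0 3000], [2; 0], opts);
fprintf('Stopped at t = %g after %.2f s.\n', t(end), toc(tStart));
```

The check only runs between solver steps, so a single very expensive evaluation of the right-hand side would still not be interrupted.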
For some other functions I guess you could do something similar; for optimization functions, I'd think you could build something like this. Other tasks seem more difficult to interrupt (matrix inversion, SVD...).
HTH
  2 Comments
Cedric on 23 Jan 2013
Edited: Cedric on 23 Jan 2013
Thank you Bjorn, I never thought about that (and I will investigate a little). I actually had something more generic in mind, as you mention in your second paragraph. I must occasionally shut down my machine manually (on/off switch) because it becomes unresponsive (mouse moving slowly, clicks not registered), for example when my sparse matrices become too dense and my system starts swapping to disk. So, sadly, I will need something less specific than the ODE tools.
Bjorn Gustavsson on 23 Jan 2013
Well, my suggestion was given with the implicit hope that your problem was "spatially constrained" to a reasonable number of points that you could reach (that is, not within built-in functions) and where you could implement a cut-it-off mechanism. If either of those 2 conditions fails, the suggestion goes up in a puff of disappointment.

Daniel Shub on 23 Jan 2013
Edited: Daniel Shub on 23 Jan 2013
On Linux the "external process" is pretty simple:
$ ulimit -t 600 -v 7000000
$ matlab
It might be -m instead of -v; I don't really understand the ulimit options. If you exceed these values (600 seconds or 7 GB of memory), then the process (i.e., MATLAB) will get killed. This won't prevent swapping; it only stops MATLAB's total memory from exceeding 7 GB. If something else is taking memory, you will still swap.
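For checking memory from inside MATLAB on Linux (the memory function used elsewhere in this thread is Windows-only), one possibility is to read the process's own entry in the /proc filesystem. This is only a sketch under assumptions: the 7e6 kB limit and the variable names are made up for the example, and it relies on the VmRSS line that Linux exposes in /proc/self/status. On other platforms rssKB simply stays NaN.

```matlab
% Hypothetical limit on resident memory, in kB (about 7 GB).
limitKB = 7e6;
rssKB = NaN;   % stays NaN where /proc is not available

if isunix && ~ismac && exist('/proc/self/status', 'file')
    % /proc/self refers to the calling process, i.e. MATLAB itself.
    statusText = fileread('/proc/self/status');
    tok = regexp(statusText, 'VmRSS:\s*(\d+)\s*kB', 'tokens', 'once');
    if ~isempty(tok)
        rssKB = str2double(tok{1});
    end
end

if rssKB > limitKB
    fprintf('WATCHDOG: resident memory %g kB > limit %g kB\n', rssKB, limitKB);
end
```

Like the other in-MATLAB checks in this thread, this can only run between statements, so it will not stop a built-in that is already swapping.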
Using ulimit to limit the timing is not ideal, since if you do a lot of things in MATLAB you will eventually exceed the CPU time allowance. My guess is that you want to limit a single "command" to 10 minutes of CPU time. To do this with ulimit, you would need to restart MATLAB after every "command".
I am not convinced you really want to limit the time (especially the cpu time and not wall clock time).
EDIT: On Windows, it seems like Windows System Resource Manager might do the trick.
  5 Comments
Jason Ross on 23 Jan 2013
@Daniel -- thanks for that link. It looks very interesting as a means of controlling things in the Windows world. Your points are very valid regarding UNIX. I haven't delved deep into limits.conf in some time, but given the roots of UNIX (specifically how to share system resources among a pool of users), it's something that's been thought about and dealt with more "over there" than in the Windows world (where sharing isn't as common).

Oliver Woodford on 30 Jun 2014
I use timers to check certain things, then exit MATLAB if things go wrong. E.g. the following timer function:
function kill_signal_t(varargin)
    if kill_signal()
        quit force;
    end
This stops the currently running code immediately, but it also terminates the MATLAB instance.
Ideally I'd like something less drastic, like throwing an error, but this gets caught (even if I use evalin('base', ...)) and doesn't interrupt the main thread :(
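One less drastic variant, sketched here under assumptions, is to let the timer only raise a flag and have the main code poll that flag and throw the error itself, so that the error originates in the main thread. The 1 s limit, the 'watchdog:timeout' identifier and the use of the timer's UserData property as the flag are choices made for this example. It only helps where the running code can be instrumented with such checks.

```matlab
% Hypothetical time limit, in seconds.
limitSeconds = 1;

% Single-shot timer: after the delay it only sets a flag in its UserData.
wd = timer('StartDelay', limitSeconds, 'UserData', false, ...
           'TimerFcn', @(obj, ~) set(obj, 'UserData', true));
start(wd);

timedOut = false;
try
    for k = 1:20
        pause(0.2);               % stand-in for real work; lets the callback run
        if get(wd, 'UserData')    % cooperative check in the main thread
            error('watchdog:timeout', ...
                  'Time limit of %g s exceeded.', limitSeconds);
        end
    end
catch err
    timedOut = strcmp(err.identifier, 'watchdog:timeout');
    fprintf('%s\n', err.message);
end

stop(wd);
delete(wd);
```

Replacing the flag-setting callback with a call to quit('force') gives back the behaviour described above, at the cost of losing the MATLAB session.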
