結果:
You may have come across code that looks like that in some languages:
stubFor(get(urlPathEqualTo("/quotes"))
.withHeader("Accept", equalTo("application/json"))
.withQueryParam("s", equalTo(monitoredStock))
.willReturn(aResponse())
.withStatus(200)
.withHeader("Content-Type", "application/json")
.withBody("{\\"symbol\\": \\"XYZ\\", \\"bid\\": 20.2, " + "\\"ask\\": 20.6}")))
That’s Java. Even if you can’t fully decipher it, you can get a rough idea of what it is supposed to do, build a rather complex API query.
Or you may be familiar with the following similar and frequent syntax in Python:
import seaborn as sns
sns.load_dataset('tips').sample(10, random_state=42).groupby('day').mean()
Here’s is how it works: multiple method calls are linked together in a single statement, spanning over one or several lines, usually because each method returns the same object or another object that supports further calls.
That technique is called method chaining and is popular in Object-Oriented Programming.
A few years ago, I looked for a way to write code like that in MATLAB too. And the answer is that it can be done in MATLAB as well, whevener you write your own class!
Implementing a method that can be chained is simply a matter of writing a method that returns the object itself.
In this article, I would like to show how to do it and what we can gain from such a syntax.
Example
A few years ago, I first sought how to implement that technique for a simulation launcher that had lots of parameters (far too many):
lauchSimulation(2014:2020, true, 'template', 'TmplProd', 'Priority', '+1', 'Memory', '+6000')
As you can see, that function takes 2 required inputs, and 3 named parameters (whose names aren’t even consistent, with ‘Priority’ and ‘Memory’ starting with an uppercase letter when ‘template’ doesn’t).
(The original function had many more parameters that I omit for the sake of brevity. You may also know of such functions in your own code that take a dozen parameters which you can remember the exact order.)
I thought it would be nice to replace that with:
SimulationLauncher() ...
.onYears(2014:2020) ...
.onDistributedCluster() ... % = equivalent of the previous "true"
.withTemplate('TmplProd') ...
.withPriority('+1') ...
.withReservedMemory('+6000') ...
.launch();
The first 6 lines create an object of class SimulationLauncher, calls several methods on that object to set the parameters, and lastly the method launch() is called, when all desired parameters have been set.
To make it cleared, the syntax previously shown could also be rewritten as:
launcher = SimulationLauncher();
launcher = launcher.onYears(2014:2020);
launcher = launcher.onDistributedCluster();
launcher = launcher.withTemplate('TmplProd');
launcher = launcher.withPriority('+1');
launcher = launcher.withReservedMemory('+6000');
launcher.launch();
Before we dive into how to implement that code, let’s examine the advantages and drawbacks of that syntax.
Benefits and drawbacks
Because I have extended the chained methods over several lines, it makes it easier to comment out or uncomment any one desired option, should the need arise. Furthermore, we need not bother any more with the order in which we set the parameters, whereas the usual syntax required that we memorize or check the documentation carefully for the order of the inputs.
More generally, chaining methods has the following benefits and a few drawbacks:
Benefits:
- Conciseness: Code becomes shorter and easier to write, by reducing visual noise compared to repeating the object name.
- Readability: Chained methods create a fluent, human-readable structure that makes intent clear.
- Reduced Temporary Variables: There's no need to create intermediary variables, as the methods directly operate on the object.
Drawbacks:
- Debugging Difficulty: If one method in a chain fails, it can be harder to isolate the issue. It effectively prevents setting breakpoints, inspecting intermediate values, and identifying which method failed.
- Readability Issues: Overly long and dense method chains can become hard to follow, reducing clarity.
- Side Effects: Methods that modify objects in place can lead to unintended side effects when used in long chains.
Implementation
In the SimulationLauncher class, the method lauch performs the main operation, while the other methods just serve as parameter setters. They take the object as input and return the object itself, after modifying it, so that other methods can be chained.
classdef SimulationLauncher
properties (GetAccess = private, SetAccess = private)
years_
isDistributed_ = false;
template_ = 'TestTemplate';
priority_ = '+2';
memory_ = '+5000';
end
methods
function varargout = launch(obj)
% perform whatever needs to be launched
% using the values of the properties stored in the object:
% obj.years_
% obj.template_
% etc.
end
function obj = onYears(obj, years)
assert(isnumeric(years))
obj.years_ = years;
end
function obj = onDistributedCluster(obj)
obj.isDistributed_ = true;
end
function obj = withTemplate(obj, template)
obj.template_ = template;
end
function obj = withPriority(obj, priority)
obj.priority_ = priority;
end
function obj = withMemory( obj, memory)
obj.memory_ = memory;
end
end
end
As you can see, each method can be in charge of verifying the correctness of its input, independantly. And what they do is just store the value of parameter inside the object. The class can define default values in the properties block.
You can configure different launchers from the same initial object, such as:
launcher = SimulationLauncher();
launcher = launcher.onYears(2014:2020);
launcher1 = launcher ...
.onDistributedCluster() ...
.withReservedMemory('+6000');
launcher2 = launcher ...
.withTemplate('TmplProd') ...
.withPriority('+1') ...
.withReservedMemory('+7000');
If you call the same method several times, only the last recorded value of the parameter will be taken into acount:
launcher = SimulationLauncher();
launcher = launcher ...
.withReservedMemory('+6000') ...
.onDistributedCluster() ...
.onYears(2014:2020) ...
.withReservedMemory('+7000') ...
.withReservedMemory('+8000');
% The value of "memory" will be '+8000'.
If the logic is still not clear to you, I advise you play a bit with the debugger to better understand what’s going on!
Conclusion
I love how the method chaining technique hides the minute detail that we don’t want to bother with when trying to understand what a piece of code does.
I hope this simple example has shown you how to apply it to write and organise your code in a more readable and convenient way.
Let me know if you have other questions, comments or suggestions. I may post other examples of that technique for other useful uses that I encountered in my experience.
If you use tables extensively to perform data analysis, you may at some point have wanted to add new functionalities suited to your specific applications. One straightforward idea is to create a new class that subclasses the built-in table class. You would then benefit from all inherited existing methods.
One workaround is to create a new class that wraps a table as a Property, and re-implement all the methods that you need and are already defined for table. The is not too difficult, except for the subsref method, for which I’ll provide the code below.
Class definition
Defining a wrapper of the table class is quite straightforward. In this example, I call the class “Report” because that is what I intend to use the class for, to compute and store reports. The constructor just takes a table as input:
classdef Rapport
methods
function obj = Report(t)
if isa(t, 'Report')
obj = t;
else
obj.t_ = t;
end
end
end
properties (GetAccess = private, SetAccess = private)
t_ table = table();
end
end
I designed the constructor so that it converts a table into a Report object, but also so that if we accidentally provide it with a Report object instead of a table, it will not generate an error.
Reproducing the behaviour of the table class
Implementing the existing methods of the table class for the Report class if pretty easy in most cases.
I made use of a method called “table” in order to be able to get the data back in table format instead of a Report, instead of accessing the property t_ of the object. That method can also be useful whenever you wish to use the methods or functions already existing for tables (such as writetable, rowfun, groupsummary…).
classdef Rapport
...
methods
function t = table(obj)
t = obj.t_;
end
function r = eq(obj1,obj2)
r = isequaln(table(obj1), table(obj2));
end
function ind = size(obj, varargin)
ind = size(table(obj), varargin{:});
end
function ind = height(obj, varargin)
ind = height(table(obj), varargin{:});
end
function ind = width(obj, varargin)
ind = width(table(obj), varargin{:});
end
function ind = end(A,k,n)
% ind = end(A.t_,k,n);
sz = size(table(A));
if k < n
ind = sz(k);
else
ind = prod(sz(k:end));
end
end
end
end
In the case of horzcat (same principle for vertcat), it is just a matter of converting back and forth between the table and Report classes:
classdef Rapport
...
methods
function r = horzcat(obj1,varargin)
listT = cell(1, nargin);
listT{1} = table(obj1);
for k = 1:numel(varargin)
kth = varargin{k};
if isa(kth, 'Report')
listT{k+1} = table(kth);
elseif isa(kth, 'table')
listT{k+1} = kth;
else
error('Input must be a table or a Report');
end
end
res = horzcat(listT{:});
r = Report(res);
end
end
end
Adding a new method
The plus operator already exists for the table class and works when the table contains all numeric values. It sums columns as long as the tables have the same length.
Something I think would be nice would be to be able to write t1 + t2, and that would perform an outerjoin operation between the tables and any sizes having similar indexing columns.
That would be so concise, and that's what we’re going to implement for the Report class as an example. That is called “plus operator overloading”. Of course, you could imagine that the “+” operator is used to compute something else, for example adding columns together with regard to the keys index. That depends on your needs.
Here’s a unittest example:
classdef ReportTest < matlab.unittest.TestCase
methods (Test)
function testPlusOperatorOverload(testCase)
t1 = array2table( ...
{ 'Smith', 'Male' ...
; 'JACKSON', 'Male' ...
; 'Williams', 'Female' ...
} , 'VariableNames', {'LastName' 'Gender'} ...
);
t2 = array2table( ...
{ 'Smith', 13 ...
; 'Williams', 6 ...
; 'JACKSON', 4 ...
}, 'VariableNames', {'LastName' 'Age'} ...
);
r1 = Report(t1);
r2 = Report(t2);
tRes = r1 + r2;
tExpected = Report( array2table( ...
{ 'JACKSON' , 'Male', 4 ...
; 'Smith' , 'Male', 13 ...
; 'Williams', 'Female', 6 ...
} , 'VariableNames', {'LastName' 'Gender' 'Age'} ...
) );
testCase.verifyEqual(tRes, tExpected);
end
end
end
And here’s how I’d implement the plus operator in the Report class definition, so that it also works if I add a table and a Report:
classdef Rapport
...
methods
function r = plus(obj1,obj2)
table1 = table(obj1);
table2 = table(obj2);
result = outerjoin(table1, table2 ...
, 'Type', 'full', 'MergeKeys', true);
r = reportingits.dom.Rapport(result);
end
end
end
The case of the subsref method
If we wish to access the elements of an instance the same way we would with regular tables, whether with parentheses, curly braces or directly with the name of the column, we need to implement the subsref and subsasgn methods. The second one, subsasgn is pretty easy, but subsref is a bit tricky, because we need to detect whether we’re directing towards existing methods or not.
Here’s the code:
classdef Rapport
...
methods
function A = subsasgn(A,S,B)
A.t_ = subsasgn(A.t_,S,B);
end
function B = subsref(A,S)
isTableMethod = @(m) ismember(m, methods('table'));
isReportMethod = @(m) ismember(m, methods('Report'));
switch true
case strcmp(S(1).type, '.') && isReportMethod(S(1).subs)
methodName = S(1).subs;
B = A.(methodName)(S(2).subs{:});
if numel(S) > 2
B = subsref(B, S(3:end));
end
case strcmp(S(1).type, '.') && isTableMethod (S(1).subs)
methodName = S(1).subs;
if ~isReportMethod(methodName)
error('The method "%s" needs to be implemented!', methodName)
end
otherwise
B = subsref(table(A),S(1));
if istable(B)
B = Report(B);
end
if numel(S) > 1
B = subsref(B, S(2:end));
end
end
end
end
end
Conclusion
I believe that the table class is Sealed because is case new methods are introduced in MATLAB in the future, the subclass might not be compatible if we created any or generate unexpected complexity.
The table class is a really powerful feature.
I hope this example has shown you how it is possible to extend the use of tables by adding new functionalities and maybe given you some ideas to simplify some usages. I’ve only happened to find it useful in very restricted cases, but was still happy to be able to do so.
In case you need to add other methods of the table class, you can see the list simply by calling methods(’table’).
Feel free to share your thoughts or any questions you might have! Maybe you’ll decide that doing so is a bad idea in the end and opt for another solution.