fast_sorted_mask

バージョン 1.2.4 (685 KB) 作成者: Bryce Henson
fast masking of ordered vectors based on binary search
ダウンロード: 18
更新 2021/12/2

# fast_sorted_mask
**Bryce M. Henson**
Matlab code for fast masking/selection of ordered vectors based on binary search.
**Status:** This Code **is ready for use in other projects**. Testing is implemented and passing.

Selecting the elements of a vector that lie between some limits (herein *masking*, sometimes known as 'gating') is a widely used analytical tool (eg. particle physics).
The common approach to masking data involves comparing each element (n) to the upper and lower limit (herein *Brute compare*) has complexity O(n). The novel contribution of this code is a demonstration of a relatively simple approach that uses a binary search algorithm ( O(log(n)) ) on an ordered vector to achieve superior performance in two use cases.
1. Data that is already sorted ( O(log(n)) cf. brute O(n) ).
2. When there is a requirement to repeatedly (m times) mask the same data such that the inial cost of the sort is offset by the increased speed of the mask operation. (O(n·log(n)+m·log(n)) cf. O(n·m) )
Note that if m is small and you check that the data is ordered (eg issorted(data)) you have probably lost most of any potential speedup already.

There are two things that the user may want from this masking operation:
1. Returning the number of data points(or counts) in this mask (herein *return mask count*).
* This is where the search based algorithm really shines compared to the brute mask (as it just subtracts the upper and lower index while the brute compare must count up the logical vector, see details).
2. Return the values of the data that makes it through this mask. (herein *retun mask values*)
* This has a smaller speedup because copying a subset of a vector (even a contiguous block) is a substantially slower than the search.

The code here demonstrates the algorithm for both types (counting and subset) in native matlab and provides a number of tests in order to compare the performance.
For taking small slices of large (>1e6 elements) sorted vectors, a speedup of 1000x is demonstrated.

## Usage Examples
The brute compare implementation is very easy by using a logical mask vector:
```
mask=data>lower_lim & data<upper_lim
```
(If you are not familaiar with Logical indexing [read more here](https://blogs.mathworks.com/loren/2013/02/20/logical-indexing-multiple-conditions/) )
the number of counts (integer) may be extracted as
```
num_mask=sum(mask)
```
or the subset of data points (vector)
```
subset_mask=data(mask)
```
The equivelent (but faster) operation using fast_sorted_mask on ordered data is:
```
mask_idx=fast_sorted_mask(data,lower_lim,upper_lim);
num_mask=mask_idx(2)-mask_idx(1)+1;
subset_mask=data(mask_idx(1):mask_idx(2));
```
WARNING: the data vector MUST be sorted. See figures above for when it is still advantagous to sort unordered data and then use this approach.

引用

Bryce Henson (2024). fast_sorted_mask (https://github.com/brycehenson/fast_sorted_mask), GitHub. 取得済み .

MATLAB リリースの互換性
作成: R2019a
すべてのリリースと互換性あり
プラットフォームの互換性
Windows macOS Linux
タグ タグを追加

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

GitHub の既定のブランチを使用するバージョンはダウンロードできません

バージョン 公開済み リリース ノート
1.2.4

Fix logo labels

1.2.3

corrected description

1.2.2

added logo

1.2.1

include acknowledgment "Smart"/Silent Figure by Daniel Eaton

1.2

この GitHub アドオンでの問題を表示または報告するには、GitHub リポジトリにアクセスしてください。
この GitHub アドオンでの問題を表示または報告するには、GitHub リポジトリにアクセスしてください。