How to store sequence of bits as a bit stream and use the least possible memory ?

5 ビュー (過去 30 日間)
Teo Protoulis
Teo Protoulis 2020 年 1 月 10 日
編集済み: Matt J 2020 年 1 月 10 日
I am implementing a huffman encoding procedure which for example encodes the decimal number -5 as '100010'. Until now I have implemented some code which correctly produces the encoded sequence. The encoded sequence is stored as a string. However, I want the bit sequence to be stored as a bit stream and use only the neccesary amount of space for storage. For example, I want the encoded sequence for -5, which is '100010', to take 6 bits of memory. Is that possible ?
  1 件のコメント
Guillaume
Guillaume 2020 年 1 月 10 日
Note that in all modern processors and certainly all that run matlab (x86/x64) the minimum unit of memory is 8 bits and everything is in multiple of 8.

サインインしてコメントする。

回答 (3 件)

Matt J
Matt J 2020 年 1 月 10 日
編集済み: Matt J 2020 年 1 月 10 日
You could chain all your strings together as a logical column vector and use bwpack, see
Example:
>> A='10001010001010001011010100011000100010100010';
>> B=bwpack((A-'0').');
>> whos A B
Name Size Bytes Class Attributes
A 1x44 88 char
B 2x1 8 uint32

Stephane Dauvillier
Stephane Dauvillier 2020 年 1 月 10 日
You can use the functions:
  1. dec2bin to tranform decimal number into binary,
  2. string to transform the output of the dec2bin from char to string
  3. regexprep to replace the first 0 by empty
  4. strjoin to join everything
For instance:
strjoin(regexprep(string(dec2bin([66,8,19,7])),"^0+","","once"),"")

Guillaume
Guillaume 2020 年 1 月 10 日
A bit similar to Matt's suggestion, but may be slightly more memory efficient since bwpack uses multiples of 32-bit, is to:
  • split your stream into block of 8-bits
  • pad the last block to 8 bits if shorter than 8.
  • convert each block to uint8, the smallest unit of storage on x86/x64
eg.
bitstream = char(randi(double('01'), 1, 19)) %random stream of 19 0-1 characters
encoded = bitstream - '0'; %convert to binary vector. Uses 4 times as much memory as bitstream
encoded = [encoded, zeros(1, mod(-numel(bitstream), 8))]; %pad with zeros at the to multiple of 8
encoded = uint8(sum(reshape(encoded, 8, []) .* 2.^(7:-1:0)')) %convert each sequence of 8 bits into 8-bit integer. 1st bit is MSB
You can easily convert back to the (padded) bitstream:
padded_origstream = reshape(dec2bin(encoded, 8)', 1, [])

カテゴリ

Help Center および File ExchangeLarge Files and Big Data についてさらに検索

タグ

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by