Masking bits in python

Utility Functions for Handling Bit Masks and Mask Arrays¶

It is common to use bit fields, such as integer variables whose individual bits represent some attributes, to characterize the state of data. For example, Hubble Space Telescope (HST) uses arrays of bit fields to characterize data quality (DQ) of HST images. See, for example, DQ field values for WFPC2 image data (see Table 3.3) and WFC3 image data (see Table 3.3). As you can see, the meaning assigned to various bit flags for the two instruments is generally different.

Bit fields can be thought of as tightly packed collections of bit flags. Using masking we can “inspect” the status of individual bits.

One common operation performed on bit field arrays is their conversion to boolean masks, for example, by assigning boolean True (in the boolean mask) to those elements that correspond to non-zero-valued bit fields (bit fields with at least one bit set to 1 ) or, oftentimes, by assigning True to elements whose corresponding bit fields have only specific fields set (to 1 ). This more sophisticated analysis of bit fields can be accomplished using bit masks and the aforementioned masking operation.

The bitmask module provides two functions that facilitate conversion of bit field arrays (i.e., DQ arrays) to boolean masks: bitfield_to_boolean_mask converts an input bit field array to a boolean mask using an input bit mask (or list of individual bit flags) and interpret_bit_flags creates a bit mask from an input list of individual bit flags.

Читайте также:  What are annotations java

Creating Boolean Masks¶

Overview¶

bitfield_to_boolean_mask by default assumes that all input bit fields that have at least one bit turned “ON” corresponds to “bad” data (i.e., pixels) and converts them to boolean True in the output boolean mask (otherwise output boolean mask values are set to False ).

Often, for specific algorithms and situations, some bit flags are okay and can be ignored. bitfield_to_boolean_mask accepts lists of bit flags that by default must be ignored in the input bit fields when creating boolean masks.

Fundamentally, by default, bitfield_to_boolean_mask performs the following operation:

(1) boolean_mask = (bitfield & ~bit_mask) != 0

(Here & is bitwise and while ~ is the bitwise not operation.) In the previous formula, bit_mask is a bit mask created from individual bit flags that need to be ignored in the bit field.

Источник

Bitmask in python dynamic programming code example

Solution 3: Searching for bit patterns within byte data is a little more challenging than typical searches. The Mask object is constructed with a source value and a source mask, both left aligned and (in this implementation) 32-bits long: The buffer is a list of byte values: A Mask object generates an internal list of shifted masks: The middle element is the exact match substring (there is no neat way to get this out of this object as is, but it’s ).

How do bit masks work in Python?

Computers represent integers with bits. This is a binary representation of a number (that is, using base 2). The only numbers in binary are 0 and 1, or «off» and «on». If you are not familiar with binary, you should read up on it, but basically you count in binary like this:

0000 0001 0010 0011 0100 0101 0110 

and so on. each column can be represented by 2^n starting with 0. So the number 0101 = 2^3*0+2^2*1+2^1*0+2^0*1 = 5 . Now when you «bit mask» something, you are basically just looking at the bits of value to you. In your case, you are only looking at the » 2^3 » bit. This is easily done by simply multiplying each bit with the corresponding bit in the mask. This can be useful for a lot of things. Sometimes we assign meaning to each bit and telling whether it is on or off is very important.

In your example. If we passed in 13, this would happen:

13 means 1101 in computer speak 1101 mask with 1000. Work on each bit individually: 1 * 1 = 1 1 * 0 = 0 0 * 0 = 0 1 * 0 = 0 checker = 1000 which means 8. 8 > 0 so return on. 

An example returning false with 5:

5 means 0101 in computer speak 0101 mask with 1000. Work on each bit individually: 0 * 1 = 0 1 * 0 = 0 0 * 0 = 0 1 * 0 = 0 checker = 0000 which means 0. 0 is not > 0 so return off. 

Hope this helps. You should be able to find extensive information about this stuff on the google machine.

Python — Generating permutations using Bitmasking, The code works by building a new string starting from an empty string. Whenever any character is added, the bitmask records it. Then the string is sent deeper into recursion for further addition of characters. When the code returns from recursion, then the added character is to be removed and the bitmask value …

Episode 20 — Bitmask Dynamic Programming

This week’s episode will cover techniques for embedding bitmasks into dynamic programming states. I’ll also discuss the «rolling bitmask » technique.02:20 — F

Python bitmask (variable length)

Your algorithm is the naive way to search for «strings» in data, but luckily there are much better algorithms. One example is the KMP algorithm, but there are others that might fit your use case better.

With a better algorithm you can go from complexity of O(n*m) to O(n+m) .

Where your mask is a multiple of 8 bits, your search becomes a relatively trivial byte comparison and any substring search algorithm will suffice (I would not recommend converting to a string and using the built in search, since you will likely suffer from failed character validation issues.)

sequence = mask = [0b10010001, 0b01101101] matches = my_substring_search(sequence, mask) 

For a mask of greater than 8 bits but not a multiple of eight, I would suggest truncating the mask to a multiple of 8 and using the same substring search as above. Then for any matches found, you can test the remainder.

sequence = mask_a = [0b10010001] mask_b = 0b01100000 mask_b_pattern = 0b11110000 # relevant bits of mask_b matches = my_substring_search(sequence, mask_a) for match in matches: if (sequence[match+len(mask_a)] & mask_b_pattern) == mask_b: valid_match = True # or something more useful. 

If sequence is a list of 4096 bytes, you may need to account for the overlap between sections. This can be easily done by making sequence a list of 4096+ceil(mask_bits/8.0) bytes, but still advancing by only 4096 each time you read the next block.

As a demonstration of generating and using these masks:

class Mask(object): def __init__(self, source, source_mask): self._masks = list(self._generate_masks(source, source_mask)) def match(self, buffer, i, j): return any(m.match(buffer, i, j) for m in self._masks) class MaskBits(object): def __init__(self, pre, pre_mask, match_bytes, post, post_mask): self.match_bytes = match_bytes self.pre, self.pre_mask = pre, pre_mask self.post, self.post_mask = post, post_mask def __repr__(self): return '(%02x %02x) (%s) (%02x %02x)' % (self.pre, self.pre_mask, ', '.join('%02x' % m for m in self.match_bytes), self.post, self.post_mask) def match(self, buffer, i, j): return (buffer[i:j] == self.match_bytes and buffer[i-1] & self.pre_mask == self.pre and buffer[j] & self.post_mask == self.post) def _generate_masks(self, src, src_mask): pre_mask = 0 pre = 0 post_mask = 0 post = 0 while pre_mask != 0xFF: src_bytes = [] for i in (24, 16, 8, 0): if (src_mask >> i) & 0xFF == 0xFF: src_bytes.append((src >> i) & 0xFF) else: post_mask = (src_mask >> i) & 0xFF post = (src >> i) & 0xFF break yield self.MaskBits(pre, pre_mask, src_bytes, post, post_mask) pre += pre pre_mask += pre_mask if src & 0x80000000: pre |= 0x00000001 pre_mask |= 0x00000001 src = (src & 0x7FFFFFFF) * 2 src_mask = (src_mask & 0x7FFFFFFF) * 2 

This code is not a complete search algorithm, it forms part of validating matches. The Mask object is constructed with a source value and a source mask, both left aligned and (in this implementation) 32-bits long:

src = 0b11101011011011010101001010100000 src_mask = 0b11111111111111111111111111100000 

The buffer is a list of byte values:

buffer_1 = [0x7e, 0xb6, 0xd5, 0x2b, 0x88] 

A Mask object generates an internal list of shifted masks:

>> m = Mask(src, src_mask) >> m._masks [(00 00) (eb, 6d, 52) (a0 e0), (01 01) (d6, da, a5) (40 c0), (03 03) (ad, b5, 4a) (80 80), (07 07) (5b, 6a, 95) (00 00), (0e 0f) (b6, d5) (2a fe), (1d 1f) (6d, aa) (54 fc), (3a 3f) (db, 54) (a8 f8), (75 7f) (b6, a9) (50 f0)] 

The middle element is the exact match substring (there is no neat way to get this out of this object as is, but it’s m._masks[i].match_bytes ). Once you have used an efficient algorithm to find this subsequence, you can verify the surrounding bits using m.match(buffer, i, j) , where i is the index of first matching byte and j is the index of the byte after the last matching byte (such that buffer[i:j] == match_bytes ).

In buffer above, the bit sequence can be found starting at bit 5, which means that _masks[4].match_bytes can be found at buffer[1:3] . As a result:

(Feel free to use, adapt, modify, sell or torture this code in any way possible. I quite enjoyed putting it together — an interesting problem — though I won’t be held liable for any bugs, so make sure you test it thoroughly!)

Searching for bit patterns within byte data is a little more challenging than typical searches. The usual algorithms don’t always work well as there is a cost for extracting each bit from the byte data and there is only an ‘alphabet’ of two characters, so just by chance 50% of comparisons will match (this makes many algorithms much less efficient).

You mentioned trying the bitstring module (which I wrote) but that it was too slow. That’s not too surprising to me so if anyone has any great ideas on how to do this I’m paying attention! But the way bitstring does it suggests a possible speed up for you:

To do the match bitstring converts chunks of the byte data into ordinary strings of ‘0’s and ‘1’, and then uses Python’s find method to do a quick search. Lots of the time is spent converting the data to a string, but as you are searching in the same data multiple times there’s a big saving to be had.

masks = ['0000101010100101', '010100011110110101101', '01010101101'] byte_data_chunk = bytearray('blahblahblah') # convert to a string with one character per bit # uses lookup table not given here! s = ''.join(BYTE_TO_BITS[x] for x in byte_data_chunk) for mask in masks: p = s.find(mask) # etc. 

The point is that once you convert to an ordinary string you can use the built-in find , which is very well optimised, to search for each of your masks. When you used bitstring it had to do the conversion for every mask which would have killed performance.

Bit manipulation — Bit masking in Python, Most of your value* constants aren’t actually bit masks, only value7 and value8 are. I’d define another bit mask to extract the lower bits, so I would have three bit masks in total: mask0 = 0x07 mask1 = 0x40 mask2 = 0x80. Now your function becomes. def parse_byte (byte): return byte & mask2, byte & mask1, byte & mask0. Code sampledef parse_byte(byte):value7_set = byte & value7 == value7value8_set = byte & value8 == value8base_value = mask_bits_on_byte(byte,value7 | value8)return value7_set,value8_set,base_valueFeedback

Dynamic Programming +Bit-Masks

You don’t actually need DP for this one but you can use bit manipulation nicely 🙂 Since A

const int A = 514; const int B = 2; vector v; //contains digits of A (e.g. 5, 1, 4) this can be done before the recursive function in a while loop. int rec(int mask, int current_number) < if(mask == (1 int ret = 0; for(int i = 0; i < v.size(); i++)< if(mask & (1 return ret; > 

Note that the reason I didn’t use DP here was that current number might differ even if mask is the same; so you can’t actually say that the situation has been repeated. Unless you memo-ize mask AND current_number which requires much more space.

Bitmasks: A very esoteric (and impractical) way of, For example you can have presets like admin = 24; user = 17; poweruser = 31;. And just set config = admin and you are good to go rather than assigning many variables. Another example, bit masks are used in Unity 3D engine to set the layers a Camera renders in runtime. Unity has two parts, an editor …

Bitmask parsing in python using only standard library

Arithmetic done elegantly, without C-style proceduralism:

>>> masklen = 8 >>> [bool(int(i)) for i in str(bin(235))[2:].rjust(masklen, '0')] [True, True, True, False, True, False, True, True] 

So if you skip pack step and just use the integer:

def bitboollist(v,n=0): l = [] t = v while t != 0: l.append(bool(t % 2)) t = t / 2 l.reverse() if len(l) == 0: l = [False] if n > len(l): l = [False]*(n-len(l)) + l return l 

using that on an example 1234 yields:

>>> bitboollist(1234) [True, False, False, True, True, False, True, False, False, True, False] >>> bitboollist(1234,n=16) [False, False, False, False, False, True, False, False, True, True, False, True, False, False, True, False] 

How do bit masks work in Python?, In your example. If we passed in 13, this would happen: 13 means 1101 in computer speak 1101 mask with 1000. Work on each bit individually: 1 * 1 = 1 1 * 0 = 0 0 * 0 = 0 1 * 0 = 0 checker = 1000 which means 8. 8 > 0 so return on. An example returning false with 5: 5 means 0101 in computer speak 0101 mask …

Источник

Оцените статью