binarysearchmodified

Improvement Of Binary Search

Interlaced Binary Search: Multiple Items’ Binary Search in One Run

by Aloke Sarkar

Abstract— The interlaced binary search (IBS) is developed to counter deficiencies of binary search (BS) that are inability to search multiple (n) items in one scan through the sorted list of 2q items and search efficiency is limited by q. IBS is specially developed to be implemented in hardware using digital logic gates. IBS brakes down the sorted list into small groups and apply separately BS in each group. Groups are formed as per input items that are to be searched for. For sufficient number of input items IBS is advantageous over BS. Two implementations of IBS are also defined – multiple inputs analog to digital converter without input multiplexing instead parallel read and compare, and data encoding.

Index Terms— Algorithms, Data compression, Delta modulation, Digital arithmetic, Information theory.

I. INTRODUCTION

I

n this paper we are defining a search algorithm that is termed as Interlaced Binary Search (IBS) and its possible applications – two cases. IBS is developed to counter deficiencies of binary search (BS). In one scan through a sorted list of 2q items BS can search only one item using q trials. A scan (run) is the process of looking through all 2q items of the list. To search n items BS requires n scans. IBS can search n items in one scan with combined efficiency of search being better than that of BS. Presently we have defined two applications of IBS that are (1) interlaced analog to digital converter (IADC) [1,2,3,4] – to replace multiplexed inputs successive approximation analog to digital converter (A/D) (Figure-1and Table-3) and (2) encoding of binary data using fewer bits (section-4). IBS is an improvement of BS to search multiple items in one scan and requires some form of parallel processing to cope simultaneously with multiple items that can be simulated sequentially in software. Other than improvement over BS, IBS does not claim any improvement over any other search procedures, that is a matter of future research. Also the algorithm is developed on targeting lower level machine and uses go to and jump statements. IBS for higher-level machines and subsequent applications are a matter of future research. For an easy understanding please refer IBS: Illustration & Simulation. Table-1 lists symbol/abbreviations (case insensitive – upper/lower case same meaning).

TABLE 1: ABBREVIATIONS / SYMBOLS (CASE INSENSITIVE)

TABLE 2: LIST OF DRAWINGS

TABLE 3: IBS ALGORITHM & IADC (Fig-1) CORRELATION

TABLE II

LIST OF DRAWINGS

Figure 1: Basic Components of an IADC.

Figure 2: Flow Chart of P1 of IBS.

Figure 3: Flow Chart of P2 initialization.

Figure 4: Flow Chart of P2 of IBS.

Figure 5: Flow Chart of P3 Of IBS.

Figure 6: CHK_AD Flow Chart (e=0).

Figure 7: CHK_AD Flow Chart (e≠0).

Figure 8: Fig-7 continuation.

Figure 9: Fig-6 and Fig-7 continuation.

Figure 10: Digital Data Representation Using IBS (Schematic)

I. PRELIMINARY

A list of 2q items are formed that are associated with key numbers ranging from 0 to (2q-1). This key numbers also forms a binary search tree T. To search input item k (AK) out of n input items that are to be searched for (hereafter this n items will be called as SIL – ‘to be searched items list’), our task is to search key associated with that item. BS sets ‘LOWER BOUND’ = LB := 0 and ‘UPPER BOUND’ = UB := (2q-1). Set ‘MID POINT ‘= MP = AD = (LB+UB)/2. MP generates a test point that sets 3 flags L (low), H (high) and F (finished) on comparing AK with AD (L=1, H=0 and F=0 if AK-ε> AD; L=0, H=1 and F=0 if AK+ε< AD; L=0, H=0 and F=1 if AK+ε≥AD≥AK-ε where ε is input signal’s noise margin in case of an A/D – for other applications set ε=0). For L=1 set LB:=MP or for H=1 set UB:=MP and continue setting and compare operation till F=1 that defines search as complete for AK and MP is the key number (for F=1 set RK:=AD). The ‘span’ from UB to LB is a search scope (parent). Resetting of LB/UB as per MP may be called as narrowing down search scope that is generation of a child scope with different span than that of parent scope. After completing search for item k start search for next item (input time division multiplexing). This procedure is an illustration only. Our target is the implementation of BS in hardware using q bit binary register (X and AD). MP=(LB+UB)/2 is implemented on shifting right X by 1 bit and setting AD = AD XOR X. BS start search on setting bit (q-1) of X and shifts right till bit ‘0’ is reached and again starts from bit (q-1) of X to search another input. Initially AD is reset to ‘0’. At a test point for H=1, reset current bit of AD with AD=AD XOR X. IBS instead of restarting on reaching bit 0 starts shift-left and also shift-right if required. At any point IBS compares simultaneously all inputs with AD so generated.

IBS generates MPs on grouping 2q nodes of T into several groups as per input items and applies BS separately to each of these groups. IBS uses some Boolean operations that are Shift-Left by 1 bit (SL1) to multiply by 2, Shift-Right by 1 bit (SR1) to divide by 2 (forms groups), XOR (exclusive-or), && (logical AND) and ++ (logical OR).

III. INTERLACED BINARY SEARCH (IBS)

There are ten (10) figures in this paper (refer table-2). Each figure is divided into eight (8) rows – A, B, C, D, E, F, G, and H, and five (5) columns – 1, 2, 3, 4, and 5. Any location of a particular figure is referenced as /<figure number><row number><column number>, e.g. /7C5 refers figure-7, row-C and column-7. All elements of a figure are tagged for easy reference, e.g. D2/2C2 refers decision box number 2 of figure-2 at row-C column-2. Connectors’ numbering syntax is <originating figure no.>.<terminating figure no.>.<connector no. in originating figure>. Also some parts of flow charts are tagged with corresponding algorithm’s steps’ number. Table-3 correlates IBS algorithm/flowchart with IBS logic of IADC (Fig – 1).

The algorithm searches n items through a binary search tree of 2q nodes. The complete algorithm consists of three phases. Each phase has some steps to run through. Phase-1 (P1) will run first independently to other two phases. After completion of P1, Phase-2 (P2) runs in conjunction with Phase-3 (P3). P2 will call P3, if at any test point it is found that there is (are) item(s) with value(s) less than the value of that test point. P1 and P3 are nothing but binary search itself and continuously narrow down search scope on looking at the middle point (MP) of the present scope. P2 widens search scope or else forms new groups on changing UB. P2 widens scope, on looking at the UB of upper half of present scope, or else looking at the UB of the lower half of the parent scope of present scope. At each test point the value of the node (AD) is compared with the values of items (Ak: k = 1, 2, …, n) and corresponding to each item a set of four binary flags (having values either 1 or 0) are set/reset to value 1/0 that are:

Lk: sets for Ak > AD - e (limit e ® 0) where ε is noise margin of input analog signal of IADC. LK ascertains that Ak is greater than lower limit of AD.

Pk: sets for Ak £ AD - e (limit e ® 0) only during jump from P2 to P3 at change over point (step 23, 24 of P2) if no other Pk is set. Once set Pk will remain set till search of item k completes. If ε=0, set Pk for Ak<AD. [P3/3C4, P4/3D3, P3/6D2, P2/6C5, P4/7B4, P2/8D4]

Hk: sets for Ak > AD + e (limit e ® 0) and is required only during P2 to look higher valued test point. For e=0, Hk=Lk. For e ¹ 0, the condition (all Lk=1) may not confirm that all remaining items in SIL (to be searched item list) have values greater than current test point. But this is confirmed by the condition (all Hk=1). This condition is redundant for widening search scope during phase-2 of the algorithm and may be discarded if all possible looping of P2 is defined properly – further research is to be done for [refer D3/4B2, D4/4D3, D5/4E3 and P4/4E4].

Fk: sets for AD + e ³ Ak ³ AD - e (limit e ® 0), indicates that item k’s search is completed, and disables Lk, Pk and Hk to affect search process, i.e. item k is kept out from SIL. If no other condition prevents, change in AD affects flags. AD is active after first execution of step 12 of P1. [D2/2C3, D1/3B3, D7/5F4, D1/6C2, P5/6F4, D1/7A2, P6/7D3, P3/8E2, D1/9B1]

Steps To Run Through: P1: [Through T, P1 proceeds with BS to search out least valued item in SIL identified with Lk flag. At any test point (or else at any test node of T), if all Lk flags are found set, all items in SIL have values greater than value of test point and search proceeds to right half of test point (right child of test node). Otherwise search moves left half of test point (left child of test node). P1 ends on reaching a leaf of T. 10-th position of step numbering refers corresponding phase number. Each step has reference to equivalent binary operation (using SL1, SR1, XOR, &&, ++) and flow chart that are cited in comment’s section enclosed with […]. X is a temporary variable. ‘b’ is for tracking power of 2 in X, as if X were a binary register. XA [] is a vector of 0 to (q-1) elements and used to track the bits of AD, as if AD were a binary register. In comment section it is assumed that X and AD are 16-bit binary registers i.e. q=16 and at start of P1 both are reset to zero i.e. holds 000016 (16 refers hexadecimal number).]

11: Set X:= 2q-1, b:= (q-1), AD:=0, XA[]=0; [Start; X:=(X XOR 800016), P3/2B4].

12: Set AD := AD + X and XA[b] := 1; [AD:=(AD XOR X). P4/2C4]

12.1: Check AD; [compare present test node of T with SIL to set L, H, F flags. CAS/2D4]

13: If (any Lk=0), Then Set AD := AD – X and XA[b] :=0; [reset bit of AD as pointed by b, as if AD were a binary register (AD:=AD XOR X). D3/2D3, P6/2E3.]

13.1: If ((any Lk=0) && (X=1)), Then Check AD; [Compare node with value 0. To realize in hardware replace this step on assigning value ‘0’ to item(s) with reset L flag(s) and simultaneously setting corresponding F flag(s). This step does not count in trial points for calculating complexity of IBS due to direct hardware realization. D5/6C4, D7/6D4, P4/6E4, D2/8C2, D4/8D2, P3/8E3.]

14: If X ¹ 1, Then Set X := X / 2, b := b-1 and go to step 12; [P1 Ends for X=1. X:= (SR1 X). D4/2F3, P5/2E4.]

[End of phase-1]. [P1E/2G3.]

PHASE II (P2): [P2 starts from the leaf of T where P1 has ended, and searches for items that are in higher order nodes or leaves of T. P2 runs either independently or in conjunction with P3. P2 runs independent to P3, if no Pk flag is set. In BS each test point is MP of present scope. P2 selects UB of present scope as test point. This UB is MP of the parent scope of the present scope. With MP=UB select parent scope as present scope. If any L flag sets with MP as test point, jump to P3. If no L flag is set again set MP=UB of present scope and continue recursively. A Change Over Point (COP) is generated if there is a jump from P2 to P3 and no P flag is set. If IBS is at COP search will continue in the lower half of present list i.e. set UB=MP and continue IBS till there is any P flag is set. ADC, XC, bc and XAC record AD, X, b, and XA respectively at COP. After all items for which P flags are set at COP are looked for, there is a return to COP to continue search in nodes of T that are in higher order than MP of COP generation scope. If there is a COP for which return to COP is yet pending there will be no generation of new COP but P2 and P3 looping is possible.]

21: If XA[b] = 0, Then

1. Set AD := AD + X, XA[b] := 1; [AD:=(AD ++ X)]

2. Check AD; [CAS/4D3.]

else jump to step-25.1 [D2/4B3, P1/4C3]

22: If ((ADC ³ AD) && (all Pk = 0)), Then Set X=:XC, AD:=ADC, b:=bc, XA:=XAC [Return to COP. D3/4B2, P2/4C2. In flow chart flag COP=0 conforms (all Pk = 0) (D1/4B2). The condition (ADC ³ AD) is redundant against premature program discontinuation and may be discarded if all possible looping of P2 and P3 are defined for.]

22.1: Check AD; [Executes on return to COP in step -22. CAS/4D3.]

23: If ((any Lk=0) && (all Pk=0)), Then

1. Set XC:=2*X, ADC:=AD, bc:=b+1, XAC:=XA; [XC:=(SL1 X).]

2. If (XC[bc]=0) Then Set ADC:=ADC+XC, XAC[bc]:=1; [(ADC:=ADC XOR XC)]

[Recording and generation of COP. ‘If ((any Lk=0) && (all Pk=0))’ also sets Pk flags if corresponding Lk=0. P1/5B3, D2/5B4, P2/5C3.]

24: If (any Lk=0), Then jump to phase III. [The condition ‘If ((any Lk=0) && (all Pk=0))’ generates COP and sets Pk flags if corresponding Lk=0. D8/4G3, P3S/4G2.]

25: If (all Hk≠1), Then go to step 26. [D4/4D3.]

25.1: Else Set X := 2 * X, b:=b+1. [(X=SL1 X). D4/4D3, D5/4E2, P3/4E1, P4/4E3.]

26: go to step 21 [End of phase-2.]

PHASE III (P3): [Only during entry to P3 Pk flag(s) can be set if no other Pk is set. In flow charts consequences of setting (D1/5B2, P1/5B3) and resetting (P4/3D3 – all PKs are reset at entry to P2, P2/9D4) of Pk flags are demonstrated with another flag COP.]

31.1: if (all Pk=0), jump to step-22 of phase-II; [D3/5C1, P3R/5D3]

31.2: if (b ¹ 0), Then Set X:=X/2, b:=b-1; [P6/5D1.]

else jump to step-21 of phase-II; [P2 and P3 looping inside a COP. D4/5D1, P3R/5D3.]

32: set AD := AD – X, XA[b] := 0. [Clearing a particular bit (b) of AD i.e. AD:=(AD XOR X). Figure reference P3/5E1]

32.1: check AD; [CAS/5E3]

33: If((all Lk = 1) && (any Pk = 1)), Then Set AD := AD + X, XA[b] := 1. [(AD := AD XOR X). 8/5G5, D5/5F2, P4/5G2. ]

34: go to step 31. [End of phase-3.]

Now we are giving a comparative study between IBS and BS. This in no means relates any other search algorithm.

THEOREM I: Interlaced Binary Search will consume [5.2(q-2)-2.q+2] trials to search out simultaneously all 2(q-1) leaves of a complete binary search tree of depth q formed from first (2q – 1) natural numbers ranging from 1 to (2q-1).

Proof: 2(q-1) leaves are formed from all odd numbers in the range 1 to (2q – 1). Phase-1 of IBS searches out only the least valued leaf (value 1) using q trials – the number of time AD is changed in step-12.

In P2 number of trial points is equal to the number of times AD is changed in step-21 or step-22. In P3 number of trial points is equal to the number of times AD is changed in step-32. On a jump to P3 from P2, maximum possible looping in P3 is possible on maximum number of execution of step-31.2. The value of this maximum looping is equal to value of ‘b’ just at the entry of P3. This condition excludes possibility of return to COP (refer step-22). First check point (C1) will come at the leaf of value ‘7’. C1 will look for leaf with value ‘5’. Second COP (C2) will come at leaf of value ‘15’ and will search for leaves with values ‘9’, ‘11’, ‘13’. Similarly third COP (C3) will come at (25-1) and COP CZ will come at (2z+2-1). At any COP CZ number of leaves to be searched is 2z including COP generation leaf. COP C0 comes at (22-1) has only one leaf (valued ‘3’) to be searched for. C0 does not call for a jump to P3. C0 is the first trial point in P2. If from any point of an execution of P3 control shifts to ‘return to COP’ (step22), maximum possible run time (say p) of P3 during that execution is equal to the number of times AD is changed in step32. To search all 2z leaves of CZ, total number of trials or else run time is calculated as follows.

Values of leaves pertaining to any COP (CZ) are values of leaves found earlier to that COP shifted with 2(z+1), i.e. values of leaves pertaining to CZ (including COP generation leaf) are: (2(z+1)+20), (2(z+1)+20+21), (2(z+1)+20+22), (2(z+1)+20+21+22),……etc. Here to mention that the precondition of being a leaf in complete binary search tree T is inclusion of 20 in calculation of value of any leaf. So to find trial points to search 2z leaves pertaining to CZ we can move recursively. On a jump to P3 from P2, maximum possible looping in P3 is possible on maximum number of execution of step-31, and that is equal to value of ‘b’ just at the entry of P3. This condition excludes possibility of return to COP (refer step-22). To count recursively, each COP (CZ) is paired with one run of P3 with maximum possible looping. This maximum possible looping for a CZ & P3 pair will be again maximum if b=(z+1) just at the entry of P3 or else just before first execution of step-31. If this pairing is not followed with return to COP (refer step-22), there will be one trial point on return to P2 after setting AD=AD+X in step-33 following step-32 execution for setting AD=AD-X while b=0. So total number of trial points including COP generation trial point in P2, pertaining to one CZ_P3 pairing is (z+1+1+1) or (z+3). If one CZ_P3 pairing is followed with return to COP, total number of trial points for that pair will be (z+1). This is because return to COP will be activated during looping in P3 with b=1, i.e. trial point for b=0 in P3 will be absent.

Let us identify this CZ & P3 pairing without return to COP in follow up as CZP and with return to COP in follow up as CZR.

Trials points for C1P are (22+21+20)P2 , (22+20)P3, (22+21)P3, (22+21+20)P2. [Subscript P2 or P3 indicates generation phase.]

Trials points for C1R are (22+21+20)P2, (22+20)P3.

Trials points for C2P are (23+22+21+20)P2, (23+21+20)P3, (23+20)P3, (23+21)P3, (23+21+20)P2.

Trials points for C2R are (23+22+21+20)P2, (23+21+20)P3, (23+20)P3.

Trial Points for C3P are (24+23+22+21+20)P2, (24+22+21+20)P3, (24+21+20)P3, (24+20)P3, (24+21)P3, (24+21+20)P2.

Trial points for CZP are (2(z+1)+2z+2(z-1)+…+20)P2, (2(z+1)+ 2(z-1)+ 2(z-2)+…+20)P3, (2(z+1)+ 2(z-2)+ 2(z-3)+…+20)P3, (2(z+1)+ 2(z-3)+ 2(z-4)+…+20)P3,……, (2(z+1)+21+20)P3, (2(z+1)+20)P3, (2(z+1)+21)P3, (2(z+1)+21+20)P2.

Trial points for CZR are (2(z+1)+2z+2(z-1)+…+20)P2, (2(z+1)+ 2(z-1)+ 2(z-2)+…+20)P3, (2(z+1)+ 2(z-2)+ 2(z-3)+…+20)P3, (2(z+1)+ 2(z-3)+ 2(z-4)+…+20)P3,……, (2(z+1)+21+20)P3, (2(z+1)+20)P3.

So total number of trial points of CZ is defined recursively as ( CZ, CZP, CZR mark total number of trial points present in CZ, CZP, CZR, respectively):

(at any point value of b = z+1)

C1= C1R = 1+1=2;

C2 = C2P + C1R = (2+3) + 2 =7;

C3 = C3P + (C2 + 2) + C1R = C3P + C2P + C1P + C1R = C3P + C2P + C1P + C1P -2 = 17;

[(C2 + 2) comes as it is not followed with return to COP. C1R+2=C1P.]

C4 = C4P + (C3+2) + (C2+2) + C1R =C4P +C3P + 2C2P + 4C1P -2 = 37;

CZ = (z+3) + z(1+2+22+23+…+2(z-2)) – (1+2.2+3.22+4.23+…+(z-1).2(z-2)) + 3(1+2+22+23+…+2(z-2)) -2 = 5.2(z-1) – 3; ……(A)

Average Trial points per leaf of CZ = 5/2 – 3.2-z ≈ 5/2 for limit z → ∞; ……(B)

So total run time to search 2q-1 leaves of complete binary search tree T of 2q nodes with IBS = trial points during P1 + first trial point in P2 before C1 + cumulative trial points of all check points ranging from C1 to C(q-2) = q + 1 + 5.(2(q-2)-1) – 3.(q-2-1+1)

= 5.2(q-2) – 2.q + 2 = LIBS (say); [theorem-1 is proved]

Average run time per leaf = 5/2 – (q/2(q-2)) + 2(2-q) ≈ 5/2 for limit q → ∞; ……(C)

With reference to section-2 total run time to search 2q-1 leaves of complete binary search tree T of 2q nodes with BS = q.2q-1 = LBS (say);

LBS – LIBS = q.2q-1 -5.2q-2 + 2.q – 2 = (2(q-2) + 1) (2.q – 5) + 3; ……(D)

So (LBS – LIBS) will go on increasing as q increases. This is conditional. As for example to search simultaneously 2 nodes with values 1 and (2(q-1)+ 2(q-2)) of T, IBS will require (3.q-2) trials in comparison to (q+2) trials of binary search. This also clarifies that the efficiency (number of trials in comparison to binary search) of IBS will go up as number of inputs goes up and also the deviation from mean of all items is low i.e. items are closely spaced in the binary search tree.

Worst Case of IBS comes when there is requirement of simultaneous searching of 2 nodes with values 1 and (2(q-1)+ 2(q-2)). Worst case run time is (3q-2). But to mention that IBS is designed to search for multiple items (much more than 2).

Best Case of IBS will be n. That is one trial per item search. There are several possible best cases. We are noting here two possible best cases.

1. n = 2.q-1. Each trial point of P1 and P2 will correspond to an item and there will be no generation of COP; i.e. there will be no execution of P3. This n items are: 2q-1, 2q-2, 2q-3, …, 20, (20 + 21), (20 + 21 + 22), (20 + 21 + 22 + 23), …, (20 + 21 + …+ 2q-1).

2. n = (2q-1) + (q2-3q+2)/2 = (q2 +q)/2. Let us associate generation of COP in above best case. We will take maximum possible COP generation. Each COP CZ will search for z items with values – (2z+1 + 2z-1 + 2z-2 + 2z-3 + …+ 20), (2z+1 + 2z-2 + 2z-3 + 2z-4 + …+ 20), (2z+1 + 2z-3 + 2z-4 + 2z-5 + …+ 20), …, (2z+1 + 20). There will be (q-2) COP ranging from C1 to Cq-2.

IV. DIGITAL DATA COMPRESSION USING IBS

This is a scheme for data compression (may be termed as Binary Data Representation on Simultaneous Multiple Items’ Binary Search) on convergence in analog and digital computations. The scheme (recursion is possible) may be incorporated directly either in hardware (as micro-programmed control micro-architecture [5]) or in software. The nk bytes of a binary file are grouped sequentially into n groups of k bytes each. Each group is numbered sequentially (call this as POS because POS gives the information of position) with the first n natural numbers starting from 0 to (n-1). In the proposed scheme instead of passing nk bytes for storage or transmission, the positions’ information of these n groups in the binary search tree formed by the first 28k natural numbers starting from 0 to (28k -1) along with POS is passed using fewer bits. This asynchronous communication may be looked as a modulation scheme for processing discrete and quanta of analog signals – termed as quanta bytes (neither continuous nor defined by two states 1 or 0) – similar to that of processing continuous analog signal through adaptive delta modulation.

With reference to figure-10 for a binary file of nk bytes each byte {B7, B6, B5, B4, B3, B2, B1, B0} is converted into analog signal equivalent Vxy (where x = 0,1, 2, (n-1) and y = 0,1, 2, …, (k-1)) using equation-1. This Vxy analog signal equivalent is used to form Vq (q = 0, 1, 2, …, (n-1)) analog quanta – termed as Qb (Quantized byte) in the number system of base 28 as per equation – 2.

Vxy = A [B7 * 27 + B6 * 26 + B5 * 25 + B4 * 24 + B3 * 23 + B2 * 22 + B1 * 21 + B0] ---- equation (1) [A is an arbitrary analog conversion constant and may be adjusted to any value]

Vq = B[Vq(k-1) * 28(k-1) + Vq(k-2) * 28(k-2) +….+ Vq0] ---- equation (2) [B is an arbitrary analog conversion constant and may be adjusted to any value]

These n Qbs will match with some nodes of the binary search tree formed with the first 28k natural numbers starting from 0 to (28k -1). The interlaced binary search (IBS) algorithm will convert these Qbs into equivalent serial bit stream. This will resemble to adaptive delta modulation [6, 7] for digital signal processing. This serial bit stream can be either stored in a serial read / write memory or transmitted through an asynchronous serial communication link. A proxy interlaced binary search algorithm will run at the receiver that will accept the serial bit stream for mimicking sender’s IBS to generate Qbs that again will be converted to the original sequential bytes through a process termed as dequantization (DQZ). As QTZ will group k bytes in sequence to form a Qb as a binary number of 8k bits with reference to base 28 number system, DQZ will merely be the de-grouping of 8k bits of a recovered Qb into sequential k bytes. For this a Qb is a group defined by the set: Qb = {Bi : i = 0, 1, …, (k-1)} where Bi stands for a byte.

Adaptive delta modulation Using IBS: Adaptive delta modulation (ADM) [6, 7] is a scheme of differential pulse code modulation (DPCM). In DPCM, instead of transmitting a base band signal m(t) on sampling and binary coding, at each sampling time (say time k) we transmit the difference between the sample value m(k) at sampling time k and the sample value m(k-1) at time (k-1). Obviously this difference will call for fewer bits for transmission. On the receiving end, the addition of this difference values will regenerate the actual signal. The number of bits in DPCM can again be reduced through delta modulation (DM). In DM m(k) is compared with regenerated signal up to m/(k-1). This comparison facilitates just single bit transmission for just two possibilities – increase or decrease regenerated m/(k). DM has the limitation, that to go with large change in original signal there is requirement of high sampling rate; that again is limited by Nyquist rate. Nyquist rate is the minimum allowable sampling rate 2fM for exact recovery of a bandlimited signal with fM as highest frequency spectral component. ADM overcomes this drawback on increase or decrease of quantization step size with fixed sampling rate.

To delta modulate n Qbs discrete states (let us say, these states be formed from an analog signal on sample and hold) following points are to be considered.

1. Each of n Qbs is assigned with a number in the range 0 to (n-1) to track the grouping sequence of nk bytes into n Qbs each of k bytes, e.g. 1st group is assigned with 0, 2nd group is assigned with 1 and so on. Call these numbers as POS and a POS needs log2n bits to represent a number.

2. In the process of IBS (through binary search tree of 28k nodes) when there is a match (i.e. for which flag Fk is set) the corresponding POS is to be transmitted instead of 8k bits.

3. During any phase of IBS, if value of AD matches with

that of any one or more Qbs, following sequence of bits will be transmitted.

M1 = 0 indicates 1 item match and will be followed with bits of corresponding POS.

M1=1 indicates more than 1 item match i.e. Qbs of equal value and will be followed with bit M2.

M2=0 indicates number of Qbs in match is less than 16 and will be followed by a nibble that will give the number of Qbs ; that again will be followed by the POSs of Qbs for which match is successful.

M2=1 indicates number of Qbs in match is more than 15 and will be followed by a group of log2n bits to mark the number; that again will be followed by the POSs of Qbs for whose match is successful.

4. During phase-1 of IBS for no match case 2 bits are to be transmitted corresponding to one sequencing from step 12 through step 14 that are:

P10 = 0 if step 13 sets XA[b] = 0.

P10 = 1 if step 13 leaves XA[b] = 1.

P11 = 0 for no successful match

P11 =1 for successful match and will be followed with the match as indicated above in paragraph #3.

5. During phase-2 of IBS for no match case 3 bits are to be transmitted corresponding to the sequencing from step21 through step 26:

P20 = 0 if step 24 does not allow jump.

P20 = 1 if step 24 allows jump.

P21 = 0 for no successful match

P21 =1 for successful match and will be followed with the match as indicated above in paragraph #3.

P22 = 1 if step 23 records COP and P20=1 [generation of COP must be followed by jump to phase-3].

P22 = 1 if step 22 returns to COP [once a COP is generated, till its return another COP can not be generated]. If preceded by P20=1, it will be implied that there is generation of another COP just after return from one COP.

P22 = 0 before return to COP as in step 22.

P22 = invalid after return from COP till generation of another COP. Validity starts after P20 =1 followed by P22=1.

6. During phase-3 of IBS for no match case 2 bits are to be transmitted corresponding to one sequencing from step 31 through step34 that are:

P30 = 0 if step 32 sets XA[b] = 0.

P30 = 1 if step 33 leaves XA[b] = 1.

P31 = 0 for no successful match [there will be Pk = 1]

P31 =1 for successful match and will be followed with the match as indicated above in paragraph #3.

P32 [for returning from phase-3]= invalid if P31= 0 [return from phase-3 is possible on either of two causes – (i) X=1 i.e. b=0, (ii) no Pk = 1 (no more phase-3 sequencing is required and return to COP)].

P32 = 0 if P31=1 and all Pk = 0 [return to phase-2].

P32 = 1 if P31=1 and any Pk = 1 [no return to phase-2].

7. To read the asynchronously transmitted serial bits the receiver will do:

(i) Running of an IBS through a binary search tree of 28k nodes.

(ii) Sequencing through the binary search tree is to be done as per information available through bits P10, P20, P22, P30, P31 and P32.

(iii) Identification of matching nodes through P11, P21, P31 and match bits defined in paragraph 3 above. Receiver on identification of Qbs will extract constituting bytes through dequantization.

Compression Ratio Calculation: To calculate compression efficiency of the presented scheme arbitrarily consider following discussion.

Total number of Qb = 2n ……(A)

Each POS for sequentially numbering a Qb needs n bits. So total bits required by 2n

POS is (n * 2n ). ……(B)

Total trial for 2n Qb with p trials per Qb = p * 2n …(C) [trial for recovering delta modulated Qbs by proxy-IBS]

Total bits required = (n.2n ) + (3.p.2n ) …(D) [from (B) & (C); with 3 bits per trial]

Total bits of original binary file = k.2n+3 …(E) [k bytes per Qb]

Compression Ratio (CR) = (D) / (E) = {(n.2n ) + (3.p.2n )}/ k.2n+3 = (n/k).2-3 + (3.p/k).2-3

Let us calculate CR for n = k = p = 16; that turn out 0.5 i.e. doubling of storage space or transmission speed. Here to mention with reference to equation (C) of section-3 that p is limited to a maximum value 5/2 for searching all leaves of the binary search tree.

With this it is evident that CR is mainly dependent on the value of p. In IBS p is not fixed but variable. Proper choice of k is a factor to keep p low on keeping deviation among the values of Qb low. To keep the deviation low there may be inclusion of error correcting bytes in the Qb that is deviating a large. Also the arbitrary choice of 3 bits per trial may be reduced further a lot on the possibility of use of bit P10 without accompanying P11 only in some restriction during adaptive delta modulation e.g. replace serial bit stream with serial byte stream and keep most significant bit of a byte reset if subsequent seven trials do not produce a match (this consideration may be extended during phase-2 in other form). For reducing p there may involve combination of Qbs originating from multiple binary files and identified with specific byte in specific position during simultaneous coding. Also there is possibility of recursion to improve CR further. So the system is looking for a lot of applications on proper adaptations of p, k and bits per trial (further research is to be looked for). The factor {(n/k).2-3 } may be eliminated on absorption of POS in Qb, e.g. as first ‘a’ bytes will hold POS. In this case each Qb will be of a unique value; the only information of a successful match (P11, P21 & P31) is to be transmitted without any subsequent match bits as defined by ‘M1’, ‘M2’ and address bits.

V. CONCLUSION

I was curious about possibility of left shifting the binary registers X and AD in the context of successive approximation A/D with input time division multiplexing and developed IBS and IADC. Thereafter the data compression scheme is developed. None of these developments have any prior reference and developed from foundation of electronics and communication engineering. IADC will find application in real time analog process control. The data compression algorithm needs further standardization for use in practical data storage and transmission applications.

References

[1] Sarkar Aloke “Hardware realization of high level language processor for analog process control”, All India Seminar on ‘Application of Evolutionary Strategies to Power, Signal Processing and Control’ (AES-2002), Rourkela, February 2002, pp 122-124.

[2] Sarkar Aloke “Analog to digital converter with software multiplexing”, ‘Emerging Trends in Mechatronics for Automation’, proceedings 18th National Convention of Mechanical Engineers, Rourkela, Phoenix Publishing House Pvt. Ltd., November 2002, pp 420-429.

[3] Sarkar Aloke “RXP: the hardware realization of high level interpreter” section X, 37th National Convention 2002 of Computer Society of India, Bangalore, Tata McGraw Hill, October 2002, pp 325, section – hardware, Sl. No. 133, sub-heading— X. Ait Quantizer, page–7.

[4] Sarkar Aloke “Hardware realization of level 5 virtual machine on Convergence in analog & digital computations” International Conference on Information Technology: Prospects & Challenges, 2003, Nepal.

[5] Tanenbaum Andrew S. “Structured Computer Organization” India, Prentice-Hall, Inc.,

[6] Taub Herbert and Schilling Donald L. “Principles of Communication Systems” Singapore, McGraw-Hill Book Company 1986.

[7] Carlson A. Bruce “Communication System” Singapore, McGraw-Hill Book Company 1986.