Big Code Search: A Bibliography
Kim, Kisub; Ghatpande, Sankalp; Kim, Dongsun ...
ACM computing surveys, 01/2024, Volume: 56, Issue: 1
Journal Article
Peer reviewed
Open access
Code search is an essential task in software development. Developers often search the internet and other code databases for necessary source code snippets to ease the development efforts. Code search techniques also help learn programming as novice programmers or students can quickly retrieve (hopefully good) examples already used in actual software projects. Given the recurrence of the code search activity in software development, there is an increasing interest in the research community. To improve the code search experience, the research community suggests many code search tools and techniques. These tools and techniques leverage several different ideas and claim a better code search performance. However, it is still challenging to illustrate a comprehensive view of the field considering that existing studies generally explore narrow and limited subsets of used components. This study aims to devise a grounded approach to understanding the procedure for code search and build an operational taxonomy capturing the critical facets of code search techniques. Additionally, we investigate evaluation methods, benchmarks, and datasets used in the field of code search.
2.
The minimum locality of linear codes
Tan, Pan; Fan, Cuiling; Ding, Cunsheng ...
Designs, codes, and cryptography, 2023/1, Volume: 91, Issue: 1
Journal Article
Peer reviewed
Locally recoverable codes (LRCs) were proposed for the recovery of data in distributed and cloud storage systems about nine years ago. A lot of progress on the study of LRCs has been made by now. However, there is a lack of general theory on the minimum locality of linear codes. In addition, the minimum locality of many known families of linear codes has not been studied in the literature. Motivated by these two facts, this paper develops some general theory about the minimum locality of linear codes, and investigates the minimum locality of a number of families of linear codes, such as q-ary Hamming codes, q-ary Simplex codes, generalized Reed-Muller codes, ovoid codes, maximum arc codes, the extended hyperoval codes, and near MDS codes. Many classes of both distance-optimal and dimension-optimal LRCs are presented in this paper. To this end, the concepts of linear locality and minimum linear locality are specified. The minimum linear locality of many families of linear codes is settled with the general theory developed in this paper.
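To make the notion of locality concrete, the following is a minimal sketch rather than the paper's method. It assumes the common definition in which coordinate i of a linear code has linear locality r when some dual codeword of weight r + 1 has i in its support, so that symbol i is a linear combination of r other symbols. The function names and the choice of the binary [7,4] Hamming code are illustrative assumptions.

```python
from itertools import product

def dual_codewords(h_rows):
    """Return all nonzero GF(2) combinations of the parity-check rows (the dual code)."""
    n = len(h_rows[0])
    words = set()
    for coeffs in product([0, 1], repeat=len(h_rows)):
        word = tuple(
            sum(c * row[j] for c, row in zip(coeffs, h_rows)) % 2
            for j in range(n)
        )
        if any(word):
            words.add(word)
    return words

def linear_locality(h_rows):
    """Per-coordinate locality: (lightest dual codeword covering i) minus 1.

    A coordinate covered by no dual codeword gets None (it cannot be
    recovered from other symbols).
    """
    n = len(h_rows[0])
    duals = dual_codewords(h_rows)
    result = []
    for i in range(n):
        weights = [sum(w) for w in duals if w[i] == 1]
        result.append(min(weights) - 1 if weights else None)
    return result

# Parity-check matrix of the binary [7,4] Hamming code:
# the columns are the binary expansions of 1..7.
HAMMING_7_4_H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]

print(linear_locality(HAMMING_7_4_H))
```

The dual of this Hamming code is the [7,3] simplex code, whose nonzero codewords all have weight 4, so every coordinate is recoverable from three others. The brute-force enumeration is only practical for small dual dimensions.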
In his pioneering work on LDPC codes, Gallager dismissed codes with parity-check matrices of weight two after proving that their minimum Hamming distances grow at most logarithmically with their code lengths. In spite of their poor minimum Hamming distances, it is shown that quasi-cyclic LDPC codes with parity-check matrices of column weight two have good capability to correct phased bursts of erasures which may not be surpassed by using quasi-cyclic LDPC codes with parity-check matrices of column weight three or more. By modifying the parity-check matrices of column weight two and globally coupling them, the erasure correcting capability can be further enhanced. Quasi-cyclic LDPC codes with parity-check matrices of column weight three or more that can correct phased bursts of erasures and perform well over the AWGN channel are also considered. Examples of such codes based on Reed-Solomon and Gabidulin codes are presented.
Keywords: Binary linear block code; Coding theory; Error-correcting code; Golay code; Multi-kernel polar code
ABSTRACT This paper describes an adaptation of a polar code decoding technique in favor of the extended Golay code. The positions of the frozen bits are fixed to 0, and the source information bits are placed in the remaining positions to form the source vector u. The codeword x is obtained by encoding u as x = u · G_P, where G_P, the Kronecker product of order log2(N) of the Arikan kernel, serves as a generator matrix of P. The SC approach, which can be described as a binary tree, was the first polar decoding algorithm developed. Because no frozen bit nodes are providing prior information, traversing the subtree yields no additional information. ...it is sufficient to estimate the leaf bits at the current node. c) Repetition (REP) node: the root node of a subtree whose leaf nodes are all frozen except the last, which is an information bit node, as illustrated in Figure 2. d) Single parity check (SPC) node: the root node of a subtree whose leaf nodes are all information bit nodes except the first, which is a frozen bit node, as illustrated in Figure 2.
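The encoding step and the special node types described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the generator matrix is taken to be the plain Kronecker power of the Arikan kernel [[1, 0], [1, 1]] without a bit-reversal permutation, the labels "RATE0" and "RATE1" for all-frozen and all-information subtrees are borrowed from common fast-decoding terminology (the abstract elides its items a and b), and every function name is hypothetical.

```python
from typing import List

ARIKAN_KERNEL = [[1, 0], [1, 1]]

def kron_gf2(a: List[List[int]], b: List[List[int]]) -> List[List[int]]:
    """Kronecker product of two binary matrices."""
    return [[x & y for x in row_a for y in row_b] for row_a in a for row_b in b]

def polar_generator(levels: int) -> List[List[int]]:
    """Kronecker power of the Arikan kernel; the result is N x N with N = 2**levels."""
    g = [[1]]
    for _ in range(levels):
        g = kron_gf2(g, ARIKAN_KERNEL)
    return g

def build_source_vector(info_bits: List[int], frozen: set, n: int) -> List[int]:
    """Put 0 at frozen positions and the information bits, in order, elsewhere."""
    bits = iter(info_bits)
    return [0 if i in frozen else next(bits) for i in range(n)]

def encode(u: List[int], g: List[List[int]]) -> List[int]:
    """Compute x = u * G over GF(2)."""
    n = len(g[0])
    return [sum(u[i] & g[i][j] for i in range(len(u))) % 2 for j in range(n)]

def classify_node(frozen_flags: List[bool]) -> str:
    """Label a subtree from its leaf pattern (True = frozen leaf)."""
    if all(frozen_flags):
        return "RATE0"
    if not any(frozen_flags):
        return "RATE1"
    if all(frozen_flags[:-1]) and not frozen_flags[-1]:
        return "REP"   # all leaves frozen except the last
    if frozen_flags[0] and not any(frozen_flags[1:]):
        return "SPC"   # all leaves information bits except the first
    return "OTHER"

G4 = polar_generator(2)
u = build_source_vector([1, 0, 1], frozen={0}, n=4)
print(u, encode(u, G4))
print(classify_node([True, True, True, False]), classify_node([True, False, False, False]))
```

For a two-leaf subtree with pattern (frozen, information) both the REP and SPC descriptions apply; this sketch reports REP because that test runs first.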
I analyzed all the theories and models of the origin of the genetic code, and over the years, I have considered the main suggestions that could explain this origin. The conclusion of this analysis is that the coevolution theory of the origin of the genetic code is the theory that best captures the majority of observations concerning the organization of the genetic code. In other words, the biosynthetic relationships between amino acids would have heavily influenced the origin of the organization of the genetic code, as supported by the coevolution theory. Instead, the presence in the genetic code of physicochemical properties of amino acids, which have also been linked to the physicochemical properties of anticodons or codons or bases by stereochemical and physicochemical theories, would simply be the result of natural selection. More explicitly, I maintain that these correlations between codons, anticodons or bases and amino acids are in fact not the result of a real correlation between amino acids and codons, for example, but only the effect of the intervention of natural selection. Specifically, in the genetic code table we expect, for example, that the most similar codons - that is, those that differ by only one base - will have more similar physicochemical properties. Therefore, the 64 codons of the genetic code table ordered in a certain way would also represent an ordering of some of their physicochemical properties. Now, a study aimed at clarifying which physicochemical property of amino acids has influenced the allocation of amino acids in the genetic code has established that the partition energy of amino acids has played a decisive role in this. Indeed, under some conditions, the genetic code was found to be approximately 98% optimized on its columns. In this same work, it was shown that this was most likely the result of the action of natural selection.
If natural selection had truly allocated the amino acids in the genetic code in such a way that similar amino acids also have similar codons - and not through a mechanism of physicochemical interaction between, for example, codons and amino acids - then it might turn out that even different physicochemical properties of codons (or anticodons or bases) show some correlation with the physicochemical properties of amino acids, simply because the partition energy of amino acids is correlated with other physicochemical properties of amino acids. It is very likely that this would inevitably lead to a correlation between codons (or anticodons or bases) and amino acids. In other words, since the codons (anticodons or bases) are ordered in the genetic code, that is to say, some of their physicochemical properties should also be ordered in a similar way, and given that the amino acids would also appear to have been ordered in the genetic code by natural selection, then it should inevitably turn out that there is a correlation between, for example, the hydrophobicity of anticodons and that of amino acids. Instead, the intervention of natural selection in organizing the genetic code would appear to be highly compatible with the main mechanism of structuring the genetic code as supported by the coevolution theory. This would make the coevolution theory the only plausible explanation for the origin of the genetic code.
• I analyzed all the theories and models of the origin of the genetic code, and over the years, I have considered the main suggestions that might explain this origin.
• This analysis concludes that the coevolution theory of the origin of the genetic code is the theory that best captures the majority of observations concerning the organization of the genetic code.
• That is, the biosynthetic relationships between amino acids would have heavily influenced the origin of the organization of the genetic code, as supported by the coevolution theory.
• Instead, the presence in the genetic code of physicochemical properties of amino acids, which have also been linked to the physicochemical properties of anticodons or codons or bases by stereochemical and physicochemical theories, would simply be the result of natural selection.
There are exactly two non-commutative rings of size 4, namely, $E = \langle a, b \mid 2a = 2b = 0, a^{2} = a, b^{2} = b, ab = a, ba = b\rangle$ and its opposite ring $F$. These rings are non-unital. A subset $D$ of $E^{m}$ is defined with the help of simplicial complexes, and utilized to construct the linear left-$E$-code $C^{L}_{D}=\{(v\cdot d)_{d\in D}: v\in E^{m}\}$ and the right-$E$-code $C^{R}_{D}=\{(d\cdot v)_{d\in D}: v\in E^{m}\}$. We study a certain binary subfield-like code corresponding to $C_{D}^{L}$. By using a Gray map, we also obtain the binary Gray images of $C_{D}^{L}$ and $C_{D}^{R}$. The weight distributions of all these codes are computed. We achieve a couple of infinite families of optimal codes with respect to the Griesmer bound. Ashikhmin-Barg's condition for minimality of a linear code is satisfied by most of the binary codes we constructed here. All the binary codes in this article are self-orthogonal and few-weight codes under certain mild conditions. This is the first attempt to study the structure of linear codes over a non-unital non-commutative ring using simplicial complexes.
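The stated properties of the ring E can be checked by brute force. The sketch below is an illustration under assumptions, not code from the article: each element is written as a pair (i, j) standing for i·a + j·b with coefficients mod 2, the name C for a + b is introduced here for convenience, and the product rule is derived from the defining relations as (ia + jb)(ka + lb) = (k + l)(ia + jb).

```python
from itertools import product

# Elements of E as coefficient pairs (i, j) meaning i*a + j*b, with 2a = 2b = 0.
ZERO, A, B, C = (0, 0), (1, 0), (0, 1), (1, 1)  # C = a + b (name assumed here)
ELEMENTS = [ZERO, A, B, C]

def add(x, y):
    """Componentwise addition mod 2."""
    return ((x[0] + y[0]) % 2, (x[1] + y[1]) % 2)

def mul(x, y):
    """Product from a^2 = a, b^2 = b, ab = a, ba = b: x * y = (k + l) * x for y = k*a + l*b."""
    s = (y[0] + y[1]) % 2
    return (s * x[0], s * x[1])

def is_associative():
    return all(mul(mul(x, y), z) == mul(x, mul(y, z))
               for x, y, z in product(ELEMENTS, repeat=3))

def is_distributive():
    return all(mul(x, add(y, z)) == add(mul(x, y), mul(x, z))
               and mul(add(x, y), z) == add(mul(x, z), mul(y, z))
               for x, y, z in product(ELEMENTS, repeat=3))

def is_commutative():
    return all(mul(x, y) == mul(y, x) for x, y in product(ELEMENTS, repeat=2))

def has_identity():
    """True if some element acts as a two-sided multiplicative identity."""
    return any(all(mul(e, x) == x and mul(x, e) == x for x in ELEMENTS)
               for e in ELEMENTS)

print(is_associative(), is_distributive(), is_commutative(), has_identity())
```

Under this encoding a and b both act as right identities, but no element is a left identity, which is why the ring has no unit.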
New MDS Euclidean Self-Orthogonal Codes
Fang, Xiaolei; Liu, Meiqing; Luo, Jinquan
IEEE transactions on information theory, 2021-Jan., Volume: 67, Issue: 1
Journal Article
Peer reviewed
Open access
In this paper, two criterions of MDS Euclidean self-orthogonal codes are presented. New MDS Euclidean self-dual codes and self-orthogonal codes are constructed via the criterions. In particular, among our constructions, for large square $q$, about $\frac{1}{8}\cdot q$ new MDS Euclidean (almost) self-dual codes over $\mathbb{F}_{q}$ can be produced. Moreover, we can construct about $\frac{1}{4}\cdot q$ new MDS Euclidean self-orthogonal codes with different even lengths $n$ and dimension $\frac{n}{2}-1$.
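The two properties named in the abstract can be checked directly on a small code. The sketch below is a toy illustration, not one of the paper's constructions: it assumes a prime field (integers mod q) and linearly independent generator rows, and the [4,2] code over F_7 was picked by hand for this example. A code is Euclidean self-orthogonal when every pair of generator rows has inner product 0, and MDS when its minimum distance equals n - k + 1.

```python
from itertools import product

def is_self_orthogonal(gen, q):
    """Euclidean self-orthogonality: all pairwise row inner products vanish mod q."""
    return all(sum(x * y for x, y in zip(r1, r2)) % q == 0
               for r1 in gen for r2 in gen)

def minimum_distance(gen, q):
    """Brute-force minimum Hamming weight over all nonzero codewords (q prime)."""
    k, n = len(gen), len(gen[0])
    best = n
    for coeffs in product(range(q), repeat=k):
        if not any(coeffs):
            continue
        word = [sum(c * row[j] for c, row in zip(coeffs, gen)) % q for j in range(n)]
        best = min(best, sum(1 for s in word if s))
    return best

def is_mds(gen, q):
    """MDS: the Singleton bound d <= n - k + 1 is met with equality."""
    k, n = len(gen), len(gen[0])
    return minimum_distance(gen, q) == n - k + 1

# Hand-picked toy generator matrix of a [4, 2] code over F_7 (an assumption of this sketch).
TOY_GEN = [
    [1, 0, 3, 2],
    [0, 1, 2, 4],
]

print(is_self_orthogonal(TOY_GEN, 7), minimum_distance(TOY_GEN, 7), is_mds(TOY_GEN, 7))
```

Because this toy code has dimension n/2 and is self-orthogonal, it is in fact self-dual. The exhaustive search enumerates q^k codewords, so it is only suitable for very small parameters.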
Canada has two national civil codes of practice that include geotechnical design provisions: the National Building Code of Canada and the Canadian Highway Bridge Design Code. For structural designs, both of these codes have been employing a load and resistance factor format embedded within a limit states design framework since the mid-1970s. Unfortunately, limit states design in geotechnical engineering has been lagging well behind that in structural engineering for the simple fact that the ground is by far the most variable (and hence uncertain) of engineering materials. Although the first implementation of a geotechnical limit states design code appeared in Denmark in 1956, it was not until 1979 that the concept began to appear in Canadian design codes, i.e., in the Ontario Highway Bridge Design Code, which later became the Canadian Highway Bridge Design Code (CHBDC). The geotechnical design provisions in the CHBDC have evolved significantly since their inception in 1979. This paper describes the latest advances appearing in the CHBDC along with the steps taken to calibrate its recent geotechnical resistance and consequence factors.
We used the Moran's I index of global spatial autocorrelation with the aim of studying the distribution of the physicochemical or biological properties of amino acids within the genetic code table. First, using this index we are able to identify the amino acid property - among the 530 analyzed - that best correlates with the organization of the genetic code in the set of amino acid permutation codes. Considering, then, a model suggested by the coevolution theory of the genetic code origin - which in addition to the biosynthetic relationships between amino acids took into account also their physicochemical properties - we investigated the level of optimization achieved by these properties either on the entire genetic code table, or only on its columns or only on its rows. Specifically, we estimated the optimization achieved in the restricted set of amino acid permutation codes subject to the constraints derived from the biosynthetic classes of amino acids, in which we identify the most optimized amino acid property among all those present in the database. Unlike what has been claimed in the literature, it would appear that it was not the polarity of amino acids that structured the genetic code, but that it could have been their partition energy instead. In actual fact, it would seem to reach an optimization level of about 96% on the whole table of the genetic code and 98% on its columns. Given that this result has been obtained for amino acid permutation codes subject to biosynthetic constraints, that is to say, for a model of the genetic code consistent with the coevolution theory, we should consider the following conclusions reasonable. (i) The coevolution theory might be corroborated by these observations because the model used referred to the biosynthetic relationships between amino acids, which are suggested by this theory as having been fundamental in structuring the genetic code.
(ii) The very high optimization on the columns of the genetic code would not only be compatible but would further corroborate the coevolution theory because this suggests that, as the genetic code was structured along its rows by the biosynthetic relationships of amino acids, on its columns strong selective pressure might have been put in place to minimize, for example, the deleterious effects of translation errors. (iii) The finding that partition energy could be the most optimized property of amino acids in the genetic code would in turn be consistent with one of the main predictions of the coevolution theory. Since the partition energy is reflective of the protein structure and therefore of the enzymatic catalysis, the latter might really have been the main selective pressure that would have promoted the origin of the genetic code. Indeed, we observe that the β-strands show an optimization percentage of 95.45%; so it is possible to hypothesize that they might have become the object of selection during the origin of the genetic code, conditioning the choice of biosynthetic relationships between amino acids. (iv) The finding that the polarity of amino acids is less optimized than their partition energy in the genetic code table might be interpreted against the physicochemical theories of the origin of the genetic code because these would suggest, for example, that a very high optimization of the polarity of amino acids in the code could be an expression of interactions between amino acids and codons or anticodons, which would have promoted its origin. This might now become less sustainable, given the very high optimization that is instead observed in favor of the partition energy but not polarity. Finally, (v) the very high optimization of the partition energy of amino acids would seem to make a neutral origin of error minimization, i.e. of the ability of the genetic code to buffer, for example, the deleterious effects of translation errors, very unlikely. 
Indeed, an optimization of about 100% would seem unlikely to have been achieved by a simple neutral process; this ability was more probably generated by the intervention of natural selection. In actual fact, we show that the neutral theory of the origin of error minimization has been falsified for the model analyzed here. Therefore, we will discuss our observations within the theories proposed to explain the origin of the organization of the genetic code, reaching the conclusion that the coevolution theory is the most strongly corroborated theory.
• We used the Moran's I index of global spatial autocorrelation with the aim of studying the distribution of the physicochemical or biological properties of amino acids within the genetic code table;
• Considering a model suggested by the coevolution theory of the genetic code origin - which in addition to the biosynthetic relationships between amino acids took into account also their physicochemical properties - we investigated the level of optimization achieved by these properties in the genetic code table;
• Unlike what has been claimed in the literature, it would appear that it was not the polarity of amino acids that structured the genetic code, but that it could have been their partition energy instead;
• It would seem that the partition energy of amino acids has reached an optimization level of about 96% on the whole table of the genetic code and 98% on its columns.
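For readers unfamiliar with Moran's I, the index is I = (N / W) · Σ_ij w_ij (x_i − x̄)(x_j − x̄) / Σ_i (x_i − x̄)², where N is the number of cells, w_ij are neighbor weights and W is their sum. The sketch below computes it on a rectangular grid. It is only an illustration: the use of rook adjacency (cells sharing an edge, weight 1) is an assumption of this example, and the abstract does not say which neighborhood the authors used on the genetic code table.

```python
def morans_i(grid):
    """Moran's I on a rectangular grid with rook adjacency (assumed weights: 1 per shared edge)."""
    rows, cols = len(grid), len(grid[0])
    n = rows * cols
    mean = sum(v for row in grid for v in row) / n
    dev = [[v - mean for v in row] for row in grid]

    denom = sum(d * d for row in dev for d in row)
    if denom == 0:
        raise ValueError("Moran's I is undefined when all values are equal")

    num = 0.0       # sum over ordered neighbor pairs of the deviation products
    weight_sum = 0  # W: number of ordered neighbor pairs
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    num += dev[r][c] * dev[rr][cc]
                    weight_sum += 1
    return (n / weight_sum) * (num / denom)

# A checkerboard: every neighbor pair is dissimilar.
print(morans_i([[1, 0], [0, 1]]))
# Two uniform rows: similar and dissimilar neighbor pairs cancel on this tiny grid.
print(morans_i([[1, 1], [0, 0]]))
```

Values near +1 indicate that similar values cluster among neighbors, values near -1 indicate alternation, and values near 0 indicate no spatial pattern under the chosen weights.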
A novel scheme is presented for encoding and iterative soft-decision decoding of cyclic codes of prime lengths. The encoding of a cyclic code of a prime length is performed on a collection of codewords which are mapped through Galois Fourier transform into a codeword in a low-density parity-check code with a binary parity-check matrix for transmission. Using this matrix, a binary iterative soft-decision decoding algorithm is applied to jointly decode a collection of codewords from the cyclic code. The joint decoding allows for information sharing among the received vectors corresponding to the codewords in the collection during the iterative decoding process. For decoding Reed-Solomon and BCH codes of prime lengths, the proposed decoding scheme not only requires much lower decoding complexity than other soft-decision decoding algorithms for these codes, but also yields superior performance. The proposed decoding scheme can also achieve a joint-decoding gain over the maximum likelihood decoding of individual codewords. The decoding scheme is also applied to quadratic residue codes.