Heating, Ventilation, and Air Conditioning (HVAC) systems are crucial for providing thermal comfort and maintaining indoor air quality. Demand-Controlled Ventilation (DCV), as a type of HVAC system, optimizes the fresh air supply based on occupants' demand while integrating with the heating system to ensure comfort and energy efficiency. However, modeling DCV and heating systems for large-scale buildings poses challenges due to complexity, limited data availability, time consumption, resource intensity, and lack of standardization. Moreover, system complexity makes HVAC systems prone to faults, resulting in increased energy consumption, discomfort, poor indoor air quality, and infrastructure risks. A generic Fault Injection (FI) framework that supports multiple fault injections in realistic scenarios is an effective tool for evaluating the reliability of these complex systems. The design challenge here is the complexity of interconnected systems and components. This paper resolves this challenge by introducing a component-based system model design integrated with an automatic multiple-fault injection framework. This approach enhances the efficiency of DCV and heating systems, reducing energy waste, improving indoor air quality, and lowering maintenance costs. The validation phase of the research demonstrates accurate component-based system model configurations for multiple fault scenarios with different attributes.
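The abstract does not describe the framework's interface, so the following Python sketch is only a rough illustration of the idea of a component-based model with automatic multiple-fault injection; the Component class, the inject_faults helper, and the fault attributes (stuck_at, bias) are hypothetical names, not the paper's design.

from dataclasses import dataclass, field

@dataclass
class Component:
    # A hypothetical HVAC component (e.g., a CO2 sensor, damper, or heating coil).
    name: str
    nominal_output: float
    faults: dict = field(default_factory=dict)   # fault attribute -> injected value

    def output(self) -> float:
        # A stuck-at fault overrides the nominal behaviour; a bias fault offsets it.
        if "stuck_at" in self.faults:
            return self.faults["stuck_at"]
        return self.nominal_output + self.faults.get("bias", 0.0)

def inject_faults(components, fault_specs):
    # Automatically apply multiple faults, each given as (component name, attribute, value).
    by_name = {c.name: c for c in components}
    for comp_name, attribute, value in fault_specs:
        by_name[comp_name].faults[attribute] = value
    return components

# Example scenario: a CO2 sensor stuck low and a biased heating coil, injected together.
model = [Component("co2_sensor", 800.0), Component("heating_coil", 45.0)]
inject_faults(model, [("co2_sensor", "stuck_at", 400.0), ("heating_coil", "bias", 5.0)])
for component in model:
    print(component.name, component.output())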
In the financial market, Bitcoin analytics has gained considerable attention due to its high-risk, high-reward nature, and it is interesting to find better techniques to analyze and predict Bitcoin price changes. In this paper, we propose Bitcoin Evolution Analytics, which aims to predict the Bitcoin price change after one hour as Bearish or Bullish. For the prediction, the approach combines Sentiment Analysis and Technical Indicators. For the Sentiment Analysis of tweets related to Bitcoin, the approach uses three Natural Language Processing (NLP) libraries, namely VADER, FinBERT, and TextBlob, which generate eight different sentiment scores. For the Technical Indicators, the approach uses three features of Bitcoin: the User Sentiment Score, the Aroon Indicators, and the Accumulation/Distribution Line Indicator. We represent all these features of the Bitcoin data over time, which creates a novel Bitcoin State Series. To predict the price change of the next hour as Bearish or Bullish, we built the state series for each hour of 13 consecutive months (March 2021 - March 2022). To find the most reliable set of features, we trained 27 ML models and, for each feature set, compared the average and maximum of the accuracies and F-measures. The results of our experiment show that using the number of followers of a tweet's author as the "weight" of its sentiment gives a more accurate prediction, and that a combination of Sentiment Analysis and Technical Indicators performs better than using Sentiment Analysis alone.
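The abstract names the indicators but not how they are computed; a non-authoritative sketch using their textbook definitions (and assumed column names high, low, close, volume for hourly candles) could look as follows. The follower-weighted sentiment mirrors the "followers as weight" idea described above.

import pandas as pd

def weighted_sentiment(scores, followers):
    # Follower-weighted average of per-tweet sentiment scores for one hour.
    total = sum(followers)
    return sum(s * f for s, f in zip(scores, followers)) / total if total else 0.0

def aroon(df, period=25):
    # Aroon Up/Down: how recently the rolling high/low occurred, scaled to [0, 100].
    df["aroon_up"] = df["high"].rolling(period + 1).apply(
        lambda w: 100.0 * w.argmax() / period, raw=True)
    df["aroon_down"] = df["low"].rolling(period + 1).apply(
        lambda w: 100.0 * w.argmin() / period, raw=True)
    return df

def ad_line(df):
    # Accumulation/Distribution Line: cumulative money-flow volume (assumes high != low).
    mfm = ((df["close"] - df["low"]) - (df["high"] - df["close"])) / (df["high"] - df["low"])
    df["ad_line"] = (mfm * df["volume"]).cumsum()
    return df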
We consider the problem of determining the zero-error list-decoding capacity
of the $q/(q-1)$ channel studied by Elias (1988). The $q/(q-1)$ channel has
input and output alphabet consisting of $q$ symbols, say, $Q =
\{x_1,x_2,\ldots, x_q\}$; when the channel receives an input $x \in Q$, it
outputs a symbol other than $x$ itself. Let $n(m,q,\ell)$ be the smallest $n$
for which there is a code $C \subseteq Q^n$ of $m$ elements such that for every
list $w_1, w_2, \ldots, w_{\ell+1}$ of distinct code-words from $C$, there is a
coordinate $j \in [n]$ that satisfies $\{w_{1,j}, w_{2,j}, \ldots,
w_{\ell+1,j}\} = Q$, where $w_{i,j}$ denotes the $j$-th coordinate of $w_i$. We
show that for $\epsilon<1/6$, for all large $q$ and
large enough $m$, $n(m,q, \epsilon q\ln{q}) \geq
\Omega(\exp{(q^{1-6\epsilon}/8)}\log_2{m})$. The lower bound obtained by
Fredman and Koml\'{o}s (1984) for perfect hashing implies that $n(m,q,q-1) =
\exp(\Omega(q)) \log_2 m$; similarly, the lower bound obtained by K\"{o}rner
(1986) for nearly-perfect hashing implies that $n(m,q,q) = \exp(\Omega(q))
\log_2 m$. These results show that the zero-error list-decoding capacity of the
$q/(q-1)$ channel with lists of size at most $q$ is exponentially small.
Extending these bounds, Chakraborty et al. (2006) showed that the capacity
remains exponentially small even if the list size is allowed to be as large as
$1.58q$. Our result implies that the zero-error list-decoding capacity of the
$q/(q-1)$ channel with list size $\epsilon q\ln{q}$ for $\epsilon<1/6$ is
$\exp{(-\Omega(q^{1-6\epsilon}))}$. This resolves the conjecture raised by
Chakraborty et al. (2006) about the zero-error list-decoding capacity of the
$q/(q-1)$ channel at larger list sizes.
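As a one-line sanity check (not part of the original statement), writing the rate of a code $C \subseteq Q^n$ with $m$ codewords as $R = \log_2 m / n$, the blocklength lower bound above translates into the capacity bound
\[
R \;=\; \frac{\log_2 m}{n(m,q,\epsilon q\ln q)} \;\le\; \frac{\log_2 m}{\exp\!\big(\Omega(q^{1-6\epsilon})\big)\,\log_2 m} \;=\; \exp\!\big(-\Omega(q^{1-6\epsilon})\big),
\]
so an exponentially large blocklength requirement is the same statement as an exponentially small capacity.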
A major challenge in defending against adversarial attacks is the enormous space of possible attacks that even a simple adversary might perform. To address this, prior work has proposed a variety of defenses that effectively reduce the size of this space. These include randomized smoothing methods that add noise to the input to take away some of the adversary's impact, and input discretization, which limits the number of actions available to the adversary. Motivated by these two approaches, we introduce a new notion of adversarial loss, which we call distributional adversarial loss, that unifies these two forms of effectively weakening an adversary. In this notion, we assume that for each original example the allowed adversarial perturbation set is a family of distributions (e.g., induced by a smoothing procedure), and the adversarial loss over each example is the maximum loss over all the associated distributions. The goal is to minimize the overall adversarial loss. We show generalization guarantees for our notion of adversarial loss in terms of the VC-dimension of the hypothesis class and the size of the set of allowed adversarial distributions associated with each input. We also investigate the role of randomness in achieving robustness against adversarial attacks in the methods described above. We show a general derandomization technique that preserves the extent of a randomized classifier's robustness against adversarial attacks. We corroborate the procedure experimentally by derandomizing the Random Projection Filters framework of \cite{dong2023adversarial}; our procedure also improves the robustness of the model against various adversarial attacks.
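The definition above is stated only in words; a minimal numerical sketch of the per-example quantity (assuming a generic loss_fn, with each allowed distribution given as a sampler and its expectation estimated by Monte Carlo) is shown below. The Gaussian-smoothing family and all names are illustrative, not the paper's construction.

import numpy as np

def distributional_adversarial_loss(loss_fn, x, y, distributions, n_samples=64, rng=None):
    # Distributional adversarial loss of one example (x, y): the maximum, over the
    # allowed family of distributions, of the expected loss under that distribution.
    rng = rng or np.random.default_rng(0)
    expected_losses = []
    for sample_perturbation in distributions:       # each entry: callable (x, rng) -> perturbed x
        losses = [loss_fn(sample_perturbation(x, rng), y) for _ in range(n_samples)]
        expected_losses.append(np.mean(losses))
    return max(expected_losses)

def gaussian_smoother(sigma):
    # One allowed distribution: add isotropic Gaussian noise at a fixed level sigma.
    return lambda x, rng: x + rng.normal(scale=sigma, size=x.shape)

# Toy usage with a fixed linear classifier and 0-1 loss.
w = np.array([1.0, -1.0])
loss_fn = lambda x, y: float(np.sign(w @ x) != y)
x, y = np.array([0.3, -0.2]), 1
family = [gaussian_smoother(s) for s in (0.1, 0.25, 0.5)]
print(distributional_adversarial_loss(loss_fn, x, y, family))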
We study the following natural question on random sets of points in \(\mathbb{F}_2^m\): Given a random set of \(k\) points \(Z=\{z_1, z_2, \dots, z_k\} \subseteq \mathbb{F}_2^m\), what is the dimension of the space of degree at most \(r\) multilinear polynomials that vanish on all points in \(Z\)? We show that, for \(r \leq \gamma m\) (where \(\gamma > 0\) is a small, absolute constant) and \(k = (1-\epsilon) \cdot \binom{m}{\leq r}\) for any constant \(\epsilon > 0\), the space of degree at most \(r\) multilinear polynomials vanishing on a random set \(Z = \{z_1,\ldots, z_k\}\) has dimension exactly \(\binom{m}{\leq r} - k\) with probability \(1 - o(1)\). This bound shows that random sets have a much smaller space of degree at most \(r\) multilinear polynomials vanishing on them, compared to the worst-case bound (due to Wei (IEEE Trans. Inform. Theory, 1991)) of \(\binom{m}{\leq r} - \binom{\log_2 k}{\leq r} \gg \binom{m}{\leq r} - k\). Using this bound, we show that high-degree Reed-Muller codes (\(\text{RM}(m,d)\) with \(d > (1-\gamma) m\)) "achieve capacity" under the Binary Erasure Channel in the sense that, for any \(\epsilon > 0\), we can recover from \((1 - \epsilon) \cdot \binom{m}{\leq m-d-1}\) random erasures with probability \(1 - o(1)\). This also implies that \(\text{RM}(m,d)\) is efficiently decodable from \(\approx \binom{m}{\leq m-(d/2)}\) random errors for the same range of parameters.
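To spell out the counting behind the "dimension exactly \(\binom{m}{\leq r} - k\)" statement (a short expansion rather than an additional claim): the space of multilinear polynomials of degree at most \(r\) on \(\mathbb{F}_2^m\) has dimension \(\binom{m}{\leq r} = \sum_{i=0}^{r}\binom{m}{i}\), and each point of \(Z\) imposes one linear condition, so
\[
\dim\{p : \deg p \leq r,\ p(z_1)=\cdots=p(z_k)=0\} \;\geq\; \binom{m}{\leq r} - k,
\]
with equality exactly when the \(k\) evaluation constraints are linearly independent. The result above says that, for a random \(Z\) of this size and \(r \leq \gamma m\), they are independent with probability \(1-o(1)\).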
The multiplicity Schwartz-Zippel lemma asserts that, over a field, a low-degree polynomial cannot vanish with high multiplicity very often on a sufficiently large product set. Since its discovery in a work of Dvir, Kopparty, Saraf and Sudan (SIAM J. Comput., 2013), the lemma has found numerous applications in both mathematics and computer science; in particular, in the definition and properties of multiplicity codes by Kopparty, Saraf and Yekhanin (J. ACM, 2014). In this work, we show how to algorithmize the multiplicity Schwartz-Zippel lemma for arbitrary product sets over any field. In other words, we give an efficient algorithm for unique decoding of multivariate multiplicity codes from half their minimum distance on arbitrary product sets over all fields. Previously, such an algorithm was known only when the underlying product set had a nice algebraic structure, for instance, when it was a subfield (Kopparty, ToC 2015), or when the underlying field had large (or zero) characteristic, the multiplicity parameter was sufficiently large, and the multiplicity code had distance bounded away from \(1\) (Bhandari, Harsha, Kumar and Sudan, STOC 2021). In particular, even unique decoding of bivariate multiplicity codes with multiplicity two from half their minimum distance was not known over arbitrary product sets over any field. Our algorithm builds upon a result of Kim and Kopparty (ToC, 2017), who gave an algorithmic version of the Schwartz-Zippel lemma (without multiplicities), or equivalently, an efficient algorithm for unique decoding of Reed-Muller codes over arbitrary product sets. We introduce a refined notion of distance based on the multiplicity Schwartz-Zippel lemma and design a unique decoding algorithm for this distance measure. On the way, we give an alternate analysis of Forney's classical generalized minimum distance decoder, which might be of independent interest.
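For reference (the abstract does not restate it), the lemma in its commonly cited form for a grid reads: for any nonzero \(f \in \mathbb{F}[x_1,\ldots,x_n]\) of total degree at most \(d\) and any finite \(S \subseteq \mathbb{F}\),
\[
\sum_{a \in S^n} \operatorname{mult}(f, a) \;\leq\; d\,|S|^{n-1},
\]
where \(\operatorname{mult}(f,a)\) is the multiplicity with which \(f\) vanishes at \(a\); since \(\operatorname{mult}(f,a) \geq 1\) at every zero, this refines the usual Schwartz-Zippel bound on the number of zeroes over the grid.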
Recently, Cohen, Haeupler and Schulman gave an explicit construction of binary tree codes over a polylogarithmic-sized output alphabet, based on Pudl\'{a}k's construction of maximum-distance-separable (MDS) tree codes using totally-non-singular triangular matrices. In this short note, we give a unified and simpler presentation of Pudl\'{a}k's and Cohen-Haeupler-Schulman's constructions.
We present a randomized algorithm that takes as input an undirected \(n\)-vertex graph \(G\) with maximum degree \(\Delta\) and an integer \(k > 3\Delta\), and returns a random proper \(k\)-coloring of \(G\). The distribution of the coloring is \emph{perfectly} uniform over the set of all proper \(k\)-colorings; the expected running time of the algorithm is \(\mathrm{poly}(k,n)=\widetilde{O}(n\Delta^2\cdot \log(k))\). This improves upon a result of Huber~(STOC 1998), who obtained a polynomial-time perfect sampling algorithm for \(k>\Delta^2+2\Delta\). Prior to our work, no algorithm with expected running time \(\mathrm{poly}(k,n)\) was known to guarantee perfect sampling with a sub-quadratic number of colors in general. Our algorithm (like several other perfect sampling algorithms, including Huber's) is based on the Coupling from the Past method. Inspired by the \emph{bounding chain} approach, pioneered independently by Huber~(STOC 1998) and H\"aggstr\"om \& Nelander~(Scand.{} J.{} Statist., 1999), we employ a novel bounding chain to derive our result for the graph coloring problem.
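The coloring bounding chain is the technical heart of the paper and is not reproduced here; the sketch below only illustrates the Coupling from the Past protocol that the algorithm is built on (fix the randomness of each past time step, restart from twice as far back, and stop once the trajectories coalesce at time 0), on a toy monotone chain: a lazy reflecting random walk whose stationary distribution is uniform on {0, ..., N}.

import random

def step(x, u, N):
    # Monotone update rule: the same uniform u drives every trajectory.
    return max(x - 1, 0) if u < 0.5 else min(x + 1, N)

def cftp_uniform(N, seed=0):
    rng = random.Random(seed)
    us = []                     # us[t] = randomness used at time -(t+1); reused forever
    T = 1
    while True:
        while len(us) < T:      # extend the fixed randomness further into the past
            us.append(rng.random())
        lo, hi = 0, N           # bottom and top states sandwich every trajectory
        for t in range(T - 1, -1, -1):   # run from time -T up to time 0
            lo, hi = step(lo, us[t], N), step(hi, us[t], N)
        if lo == hi:            # coalesced: this value is an exact stationary sample
            return lo
        T *= 2                  # not coalesced: start again from twice as far back

# Draw a few perfect samples from the uniform distribution on {0, ..., 10}.
print([cftp_uniform(10, seed=s) for s in range(8)])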
The multiplicity Schwartz-Zippel lemma bounds the total multiplicity of zeroes of a multivariate polynomial on a product set. This lemma motivates the multiplicity codes of Kopparty, Saraf and Yekhanin (J. ACM, 2014), who showed how to use this lemma to construct high-rate locally-decodable codes. However, the algorithmic results about these codes crucially rely on the fact that the polynomials are evaluated on a vector space and not an arbitrary product set. In this work, we show how to decode multivariate multiplicity codes of large multiplicities in polynomial time over finite product sets (over fields of large characteristic and zero characteristic). Previously, such decoding algorithms were not known even for a positive fraction of errors. In contrast, our work goes all the way to the distance of the code and, in particular, exceeds both the unique decoding bound and the Johnson bound. For errors exceeding the Johnson bound, even combinatorial list-decodability of these codes was not known. Our algorithm is an application of the classical polynomial method directly to the multivariate setting. In particular, we do not rely on a reduction from the multivariate to the univariate case, as is typical of many of the existing results on decoding codes based on multivariate polynomials. However, a vanilla application of the polynomial method in the multivariate setting does not yield a polynomial upper bound on the list size. We obtain a polynomial bound on the list size by taking an alternative view of multivariate multiplicity codes. In this view, we glue all the partial derivatives of the same order together using a fresh set \(z\) of variables. We then apply the polynomial method by viewing this as a problem over the field \(\mathbb{F}(z)\) of rational functions in \(z\).