  • 1 Sona College of Technology, India
  • 2 Einstein College of Engineering, India
  • 3 Sona College of Technology, India
Open access

Abstract

Cluster analysis plays a central role in identifying groups of genes that show similar behavior under a set of experimental conditions. Several clustering algorithms have been proposed for identifying gene behaviors and understanding their significance. The principal aim of this work is to develop an intelligent rough clustering technique that efficiently removes the irrelevant dimensions in a high-dimensional space and obtains meaningful clusters. This paper proposes a novel biclustering technique based on rough set theory. The proposed algorithm uses the correlation coefficient as a similarity measure to simultaneously cluster both the rows and columns of a gene expression data matrix, and the mean squared residue to generate the initial biclusters. The biclusters are then refined to form lower and upper boundaries by determining the membership of the genes in the clusters using the mean squared residue. The algorithm is illustrated with yeast gene expression data, and the experiment demonstrates the effectiveness of the method. Its main advantage is that it overcomes the problem of selecting initial clusters and, by allowing biclusters to overlap, removes the restriction that one object belongs to only one cluster.

Introduction

Many techniques have emerged for analyzing microarray gene expression data, but clustering remains the primary [1] and most popular approach for analyzing the expression of thousands of genes and has been successful in many applications [2]. Clustering assigns a set of observations to subsets called clusters so that observations in the same cluster are similar in some sense: objects within a cluster are highly similar, and objects in different clusters are highly dissimilar. In other words, clustering increases intraclass similarity and decreases interclass similarity. It is an unsupervised learning technique that requires no prior knowledge of the groups to which the objects belong. A variety of clustering algorithms have been proposed for analyzing gene expression data [3]. Conventional clustering algorithms such as k-means, hierarchical clustering, SOM, and density-based methods are very common. The results produced by these methods are consistent for microarray experiments performed under homogeneous conditions. However, when the experimental conditions vary to a great extent, the clusters are no longer correct. This led to a promising alternative clustering paradigm, biclustering.

Biclustering algorithms, also referred to as co-clustering algorithms, capture the consistency exhibited by a subset of genes over a subset of conditions. An increasing number of biclustering algorithms have been proposed for identifying gene patterns [4–9]. Most of these algorithms find exclusive biclusters, which are often inappropriate in a biological context: since biological processes depend on each other, many genes participate in two or more different processes. Each gene should therefore be allowed to belong to multiple biclusters.

The proposed biclustering algorithm addresses this problem by introducing the framework of generalized rough sets into biclustering. Rough set theory is a topic of considerable importance in computational intelligence research, and its extension to clustering is a useful addition to the range of cluster analysis techniques available to researchers. Rough sets have only recently been introduced into clustering, and very few clustering algorithms based on rough set theory have been developed [10–12]. A technique combining k-means and rough set approaches, proposed in [13], introduced the concept of upper and lower bounds for the k-means centroids. An enhancement to this technique was proposed in [14]. The main drawback of these techniques is that they do not address the problem of selecting the initial parameters.

This work aims to develop a biclustering algorithm that efficiently identifies all subsets of genes that exhibit similar patterns under a subset of experimental conditions. The problem of selecting initial seeds is also addressed, and the quality of the overlapping biclusters is refined based on the mean squared residue. Moreover, the proposed approach benefits from the major advantages of rough methods [15] over crisp techniques. One aspect of rough sets that is particularly important in gene expression clustering is that they facilitate the identification of overlapping clusters. By allowing genes to be members of several clusters, rough methods can more suitably capture the complex relations governing gene regulation.

Rough bi-correlation clustering

The structural framework

In this section, we present the framework and the methodology used to determine the biclusters with Pearson’s correlation coefficient and the mean squared residue. We then give a detailed description of the ROBICOR algorithm and of how it is integrated into the rough clustering process. We also explain how the algorithm automatically determines the number of clusters present in a dataset and produces biclusters with upper and lower approximations.

The ROBICOR algorithm is designed to be intelligent and more efficient. It is intelligent as it does not require the number of clusters as input. It is more efficient as it uses Pearson’s correlation coefficient and mean squared residue for producing high quality overlapping biclusters. The framework of the proposed model is shown in Figure 1. The proposed algorithm is also robust as it handles noisy data.

Figure 1.

The proposed rough set based model for biclustering

Citation: Acta Microbiologica et Immunologica Hungarica 63, 2; 10.1556/030.63.2016.2.4

Preprocessing of data

Some genes in the gene expression matrix do not respond much to the experimental conditions and so do not actively participate in the biclustering of the data. These genes are called ‘flat genes’ and should be removed to obtain good quality biclusters. For this, we use the formula proposed by Tang et al. [16]. Each gene vector with m conditions can be represented as gi = (ei1,ei2,…,eim). A vector-cosine is used to match each gene vector against a predefined pattern H = (h1,h2,…,hm) to determine the deviation in gene intensity values among samples, as shown in Equation (1).
$$\cos(\theta) = \frac{\sum_{j=1}^{m} e_{ij}\, h_j}{\sqrt{\sum_{j=1}^{m} e_{ij}^{2}}\;\sqrt{\sum_{j=1}^{m} h_j^{2}}} \tag{1}$$

The closer the value of the vector-cosine is to 1, the more similar the two vectors are. A threshold value is chosen, and the genes whose cos(θ) values exceed this threshold are removed. This process removes the gene vectors that are most similar to the predefined pattern. The data is now preprocessed and ready for clustering.
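The filtering step can be illustrated with a minimal sketch. It assumes a flat reference pattern H = (1, 1, 1, 1), a threshold of 0.95, and toy expression values; all three are hypothetical choices made for illustration and are not taken from the paper.

```python
import math

def cosine_to_pattern(gene, pattern):
    """Vector cosine between a gene's expression vector and a reference pattern (Equation 1)."""
    dot = sum(e * h for e, h in zip(gene, pattern))
    norm_gene = math.sqrt(sum(e * e for e in gene))
    norm_pattern = math.sqrt(sum(h * h for h in pattern))
    if norm_gene == 0.0 or norm_pattern == 0.0:
        return 0.0  # assumption: an all-zero vector is treated as dissimilar
    return dot / (norm_gene * norm_pattern)

def remove_flat_genes(matrix, pattern, threshold):
    """Keep only the genes whose cosine to the pattern does not exceed the threshold."""
    return [g for g in matrix if cosine_to_pattern(g, pattern) <= threshold]

# Hypothetical toy data: one flat gene and two genes that vary across conditions.
H = [1.0, 1.0, 1.0, 1.0]
genes = [
    [2.0, 2.0, 2.0, 2.0],  # flat: cosine to H is 1.0, so it is removed
    [1.0, 5.0, 1.0, 5.0],
    [0.5, 0.1, 3.0, 0.2],
]
kept = remove_flat_genes(genes, H, threshold=0.95)
```

With these values only the first gene exceeds the threshold, so `kept` holds the two varying genes.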

The biclustering algorithm

Usually, gene expression data is arranged in a form of a data matrix. Each row corresponds to one gene and each column to one condition. Each element of this matrix is a real number that represents the expression level of a gene under a specific experimental condition. The value of each element is usually the logarithm of the relative abundance of the mRNA of the gene under the specific condition. Pearson correlation coefficient for measuring similarity between expression patterns of two genes xi and xj is defined as
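$$\mathrm{Sim}(x_i, x_j) = \frac{\sum_{l=1}^{m} (x_{il} - \bar{x}_i)(x_{jl} - \bar{x}_j)}{\sqrt{\sum_{l=1}^{m} (x_{il} - \bar{x}_i)^{2}\; \sum_{l=1}^{m} (x_{jl} - \bar{x}_j)^{2}}} \tag{2}$$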
where xil and xjl are l-th expression values of the i-th and j-th genes, respectively. The terms x¯i and x¯j are mean values over m expression values (corresponding to microarray experiments) of the i-th and j-th genes, respectively. The value of m gives the number of conditions or samples under which the genes exhibit the expression patterns.
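As a small illustration, the Pearson similarity between two expression profiles can be computed as follows. The function name and the toy profiles are illustrative, and returning 0 for a constant profile is an assumption, since the correlation is undefined in that case.

```python
import math

def pearson_similarity(x_i, x_j):
    """Pearson correlation between two equally long expression profiles."""
    m = len(x_i)
    mean_i = sum(x_i) / m
    mean_j = sum(x_j) / m
    numerator = sum((a - mean_i) * (b - mean_j) for a, b in zip(x_i, x_j))
    denominator = math.sqrt(
        sum((a - mean_i) ** 2 for a in x_i) * sum((b - mean_j) ** 2 for b in x_j)
    )
    if denominator == 0.0:
        return 0.0  # assumption: a constant profile is treated as uncorrelated
    return numerator / denominator

# Hypothetical profiles over m = 4 conditions.
g1 = [1.0, 2.0, 3.0, 4.0]
g2 = [2.0, 4.0, 6.0, 8.0]   # same shape as g1, scaled
g3 = [4.0, 3.0, 2.0, 1.0]   # opposite trend

sim_same = pearson_similarity(g1, g2)
sim_opposite = pearson_similarity(g1, g3)
```

Because the measure is invariant to scaling and shifting, `g1` and `g2` are perfectly correlated even though their absolute expression levels differ.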

The proposed ROBICOR algorithm approximates a set of overlapping biclusters simultaneously, each with a relatively low mean squared residue. Step 1 of the algorithm produces biclusters using the correlation coefficient and the mean squared residue. ROBICOR uses the Pearson correlation coefficient, defined in Equation (2), to measure the similarity between the expression patterns of two genes xi and xj. The idea of generating biclusters using the correlation coefficient is taken from the BCCA algorithm proposed by Bhattacharya and De [17]. BCCA uses only Pearson’s correlation coefficient to detect biclusters, whereas ROBICOR uses the mean squared residue in addition to Pearson’s correlation coefficient to detect biclusters of high quality. ROBICOR starts with a pair of genes and finds the conditions under which they are co-regulated. For any pair of genes (gi, gj), the algorithm computes the similarity of the genes over all conditions. Sim(xi,xj) ≥ θ indicates that xi and xj are similarly expressed, i.e., their expression patterns change in a similar way. If the similarity is less than θ, the algorithm finds the condition whose elimination gives the maximum increase in the correlation coefficient. That condition is eliminated from the condition set, and this step is repeated until the similarity reaches θ, provided that the number of remaining conditions does not fall below a specified minimum number of conditions. If the two genes are still not correlated, the algorithm moves on to the next pair of genes. Otherwise, it forms a bicluster from the initial two genes and the remaining conditions. The bicluster is then refined by including new genes, based not only on their correlation with all the other genes in the bicluster but also on the mean squared residue of the bicluster: whenever a new gene is added, the mean squared residue of the bicluster is recalculated.
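The seeding step for a single pair of genes can be sketched as below. This is an illustrative reading of the description, not the authors' implementation: the function names are hypothetical, the loop stops as soon as only r conditions remain, and a pair that cannot reach θ is reported as `None`.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long lists (0.0 if either is constant)."""
    m = len(x)
    mx, my = sum(x) / m, sum(y) / m
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den if den else 0.0

def seed_conditions(g_i, g_j, theta, r):
    """Greedily drop conditions until the pair correlates at >= theta.

    Returns the surviving condition indices, or None if the pair never
    reaches theta with at least r conditions left.
    """
    J = list(range(len(g_i)))

    def sim(cols):
        return pearson([g_i[c] for c in cols], [g_j[c] for c in cols])

    while sim(J) < theta and len(J) > r:
        # Remove the condition whose elimination raises the correlation the most.
        best = max(J, key=lambda y: sim([c for c in J if c != y]))
        J.remove(best)
    return J if sim(J) >= theta and len(J) >= r else None

# Hypothetical pair: co-regulated on the first four conditions, not on the fifth.
g_a = [1.0, 2.0, 3.0, 4.0, 10.0]
g_b = [2.0, 4.0, 6.0, 8.0, 0.0]
conditions = seed_conditions(g_a, g_b, theta=0.8, r=3)

# A pair with opposite trends never reaches theta and is rejected.
rejected = seed_conditions([1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0], theta=0.8, r=3)
```

Evaluating every candidate removal costs one correlation per remaining condition, which is why the sketch recomputes `sim` inside the `max` call rather than caching anything.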

Algorithm

Step 1: Detect bicluster set Biclust.

Biclust = ∅;
For each pair of genes (gi, gj), i ≠ j, do:
{
 Set I = {gi, gj}, J = set of all conditions, and m = |J|
 While Sim(gi, gj) < θ over the m expression values in J, and m ≥ r, do:
  {
   Among the m expression values, find the condition y whose elimination from J causes the maximum increase in Sim(gi, gj)
   Remove y from the set J and set m = m − 1.
  }
 If Sim(gi, gj) ≥ θ over the m expression values in J, where m ≥ r, then
  {
   Remove the set I from X (the set of all genes);
   For each gp ∈ X, do:
   {
    If Sim(gi, gp) ≥ θ for all gi ∈ I over the m expression values in J, and Kgp ≤ δ (where Kgp is the mean squared residue of the bicluster with gp added), then set I = I ∪ {gp}
    Remove gp from the set X;
   }
   Set ccount = ccount + 1; Biclust = Biclust ∪ {(I, J)};
  }
}

Step 2: Detect upper and lower approximations

For each bicluster Bj ∈ Biclust = {B1,B2,…,Bn}
{
 For each object v in bicluster Bj do
 {
  If v belongs to one and only one bicluster Bj and KBj ≤ δ, then
  {v belongs to the lower bound of Bj.}
  Else
  {Compute the difference in the mean squared residue for each bicluster (v inserted and removed)
  Let dmin be the minimum of these differences.
  Find the ratio between dmin and the mean squared residue of each of the other biclusters
  If the ratio ≤ ω, add the bicluster to set P.
  If P ≠ ∅, insert v into the boundary region of the bicluster with dmin and into the upper approximations B̄j for all j ∈ P;
  Else insert v into the lower bound of the bicluster with dmin.
  }
 }
}

The roughness measure

The mean squared residue of a bicluster (I,J) as defined by Cheng and Church [18] is
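$$K(I,J) = \frac{1}{|I|\,|J|} \sum_{i \in I,\, j \in J} r_{ij}^{2}, \tag{3}$$
where the residue
$$r_{ij} = d_{ij} - d_{iJ} - d_{Ij} + d_{IJ} \tag{4}$$
is an indicator of the degree of coherence of an entry with the remaining entries of a bicluster. The base of the gene gi is defined as
$$d_{iJ} = \frac{1}{|J|} \sum_{j \in J} d_{ij}, \tag{5}$$
the base of the condition Cj is defined as
$$d_{Ij} = \frac{1}{|I|} \sum_{i \in I} d_{ij}, \tag{6}$$
and the base of the bicluster dIJ is defined as
$$d_{IJ} = \frac{1}{|I|\,|J|} \sum_{i \in I,\, j \in J} d_{ij}. \tag{7}$$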

A gene is placed in the bicluster only if the resulting mean squared residue is less than δ. Thus, the algorithm reduces the possibility of misplacing a gene in a bicluster. The mean squared residue indicates the general coherence of a bicluster: the lower the mean squared residue, the stronger the coherence exhibited by the bicluster and the higher its quality. By the end of step 1, the high quality biclusters and the objects in each bicluster have been obtained.
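The mean squared residue of Cheng and Church can be computed directly from its definition. The sketch below is a plain-Python illustration with hypothetical toy data and a hypothetical threshold δ = 1.0; it shows that a purely additive submatrix has a residue of zero, while a gene with a different pattern raises it.

```python
def mean_squared_residue(data, rows, cols):
    """Mean squared residue K(I, J) of the bicluster given by row and column indices."""
    n_rows, n_cols = len(rows), len(cols)
    row_mean = {i: sum(data[i][j] for j in cols) / n_cols for i in rows}   # d_iJ
    col_mean = {j: sum(data[i][j] for i in rows) / n_rows for j in cols}   # d_Ij
    all_mean = sum(data[i][j] for i in rows for j in cols) / (n_rows * n_cols)  # d_IJ
    total = 0.0
    for i in rows:
        for j in cols:
            residue = data[i][j] - row_mean[i] - col_mean[j] + all_mean
            total += residue * residue
    return total / (n_rows * n_cols)

# Hypothetical expression matrix: genes 0-2 follow an additive pattern, gene 3 does not.
data = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [4.0, 5.0, 6.0],
    [9.0, 1.0, 5.0],
]
coherent = mean_squared_residue(data, rows=[0, 1, 2], cols=[0, 1, 2])
with_outlier = mean_squared_residue(data, rows=[0, 1, 3], cols=[0, 1, 2])

delta = 1.0  # hypothetical threshold, chosen only for this toy example
accept_gene_3 = with_outlier <= delta
```

Here `coherent` is zero up to rounding, whereas adding gene 3 pushes the residue above the toy threshold, so that gene would not be admitted to the bicluster.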

Based on the concepts of rough sets [16, 19], we can consider each bicluster as a generalized rough set with two approximations, a lower bound and an upper bound. The genes or conditions of the lower approximation belong only to the bicluster, whereas the members of the upper approximation may belong to one or more biclusters. This property leads to overlapping among corresponding biclusters. Given a gene expression data matrix R, for each object (gene or condition), there are three possibilities in the bicluster membership:

  1. not belonging to any biclusters in R or
  2. belonging to the lower approximation of the bicluster or
  3. belonging to the upper approximation of the bicluster in R.

Step 2 of the algorithm places the genes in the upper and lower approximations, using the ratio of the mean squared residues of the biclusters to which a gene belongs. Bicluster membership is determined as follows. For each object vector v, an element of S = {C1,C2,…,Cn}, where C1,…,Cn are the biclusters generated, find the difference in the mean squared residue before and after the removal of v using Equation (8).
$$\Delta K(v, X_j) = K'(X_j) - K(X_j) \tag{8}$$
Here K′(Xj) and K(Xj) are the mean squared residues of the bicluster Xj after and before v is removed, respectively. The minimum of these values is dmin:
$$d_{\min} = \min_{1 \le j \le k} \Delta K(v, X_j) \tag{9}$$
Equation (9) identifies the bicluster that has the minimum mean squared residue when gene v is inserted into it. Equation (10) gives the ratio R between the bicluster with the minimum mean squared residue and each of the others.
$$R = \frac{\Delta K(v, X_j)}{\Delta K(v, X_i)} \tag{10}$$
Equation (10) helps to resolve the membership of the gene v. Let
$$D = \left\{\, j \;\middle|\; \frac{\Delta K(v, X_j)}{\Delta K(v, X_i)} \le \omega,\; i \ne j \,\right\} \tag{11}$$
i.e., the set D consists of all biclusters for which the ratio R is at most ω. If D ≠ ∅, then v is placed in the upper boundary of all biclusters present in the set D. Otherwise, if D = ∅, v is placed in the lower boundary of the bicluster that has the minimum mean squared residue.
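The first part of this procedure, Equations (8) and (9), can be sketched as follows. The sketch takes ΔK as the residue after removing v minus the residue before, which is one reading of Equation (8); the bicluster names, the toy matrix, and the helper names are hypothetical. It stops at dmin and does not reproduce the ratio test of Equations (10) and (11).

```python
def msr(data, rows, cols):
    """Mean squared residue of the bicluster given by row and column indices."""
    n_r, n_c = len(rows), len(cols)
    row_mean = {i: sum(data[i][j] for j in cols) / n_c for i in rows}
    col_mean = {j: sum(data[i][j] for i in rows) / n_r for j in cols}
    all_mean = sum(data[i][j] for i in rows for j in cols) / (n_r * n_c)
    return sum(
        (data[i][j] - row_mean[i] - col_mean[j] + all_mean) ** 2
        for i in rows for j in cols
    ) / (n_r * n_c)

def delta_k(data, v, rows, cols):
    """Delta K(v, X_j): residue after removing gene v minus residue before (assumed sign)."""
    remaining = [i for i in rows if i != v]
    return msr(data, remaining, cols) - msr(data, rows, cols)

# Hypothetical data: gene 3 belongs to two overlapping biclusters X1 and X2.
data = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 3.0, 4.0, 5.0],
    [3.0, 4.0, 5.0, 6.0],
    [9.0, 1.0, 7.0, 2.0],
]
biclusters = {
    "X1": ([0, 1, 3], [0, 1, 2, 3]),
    "X2": ([1, 2, 3], [0, 1, 2]),
}
v = 3
deltas = {name: delta_k(data, v, rows, cols) for name, (rows, cols) in biclusters.items()}
best = min(deltas, key=deltas.get)   # bicluster attaining d_min
d_min = deltas[best]
```

In this toy case removing gene 3 lowers the residue of both biclusters, so every ΔK is negative; with real data the values would feed the ratio test that decides between the lower approximation and the boundary region.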

The parameters ω and δ used in this procedure are predefined thresholds. The parameter δ ensures that all biclusters discovered have mean squared residues less than δ, which improves cluster quality. The parameter ω determines the degree of overlapping among these biclusters. The set D is calculated using the formula given in Equation (11). The concept of using the mean squared residue for rough biclustering was proposed by Wang et al. [12]. In the proposed method, we use Pearson’s correlation and the mean squared residue for biclustering, and we use the ratio of the mean squared residues of the clusters to find the lower and upper approximations.

The initial part of the proposed algorithm ROBICOR generates the number of biclusters. Then the algorithm goes one step further to find the quality of the biclusters generated and also the upper and lower bounds for each bicluster. We have used mean squared residue to determine the bicluster quality and the membership of objects in the lower and upper approximation of the bicluster. The ratio between the mean squared residue of a bicluster and the volume of the bicluster depicts the overall quality of the bicluster. The average of this ratio is also found to decide about the degree of overlapping.

Performance evaluation

The size of the biclusters obtained by ROBICOR depends on the correlation threshold value θ. The optimum threshold was selected by varying θ between 0 and 1, a process that is very time consuming. The algorithm was run for θ values in the set {0.72, 0.74, 0.76, 0.78, 0.80, 0.82, 0.84, 0.86, 0.88}. The resulting variation in cluster accuracy is shown in the line graph of Figure 2. The relative accuracy (the accuracy of the algorithm expressed as a percentage) is 83% when θ is 0.80. As the algorithm yields better results at this value, we chose the threshold to be 0.80.

Figure 2.

The relative accuracy of ROBICOR for various values of correlation threshold θ

The degree of overlapping between the biclusters is determined by the parameter ω. The values {0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7} were tested on the datasets. Figure 3 shows the degree of overlapping of the clusters for different values of ω. A value of 0.6 for ω yields an optimal result. Moreover, the algorithm delivers meaningful results over the range [0.5, 0.7] of ω, where the overlapping degree increases dramatically and then stabilizes, as shown in Figure 3. The values of the parameters adopted in ROBICOR are presented in Table I.

Figure 3.

Relative increase in the degree of overlapping when varying ω, the roughness threshold for ROBICOR algorithm

Table I.

Optimum values for the parameters used in ROBICOR

| Procedure | Parameter | Value |
|---|---|---|
| Generating the initial biclusters | Threshold for correlation coefficient θ | 0.8 |
| Rough clustering | Overlapping threshold ω | 0.6 |
| | Mean squared residue threshold | 300 |

The mean squared residue of the biclusters produced by the proposed algorithm is analyzed and evaluated. The lower the mean squared residue, the higher is the quality of the bicluster. It has been found that for most of the biclusters, the mean squared residue value falls below 300. The threshold value 300 is chosen as stated in [19, 20]. Figure 4 shows the mean squared residue for the biclusters produced by ROBICOR for the yeast dataset. It can be observed that the mean squared residue of most of the biclusters falls below 350.

Figure 4.

Mean squared residue of the biclusters detected using ROBICOR algorithm

Results

The performance of the proposed algorithm was tested on three data sets: yeast gene expression data, colon cancer data, and leukemia data. The yeast data set is a 384 × 17 matrix; a total of 384 genes were clustered based on 17 experimental conditions. Next, the algorithm was run on the colon cancer data set, which contains expression levels of 2,000 genes taken from 62 different samples, out of which 50 genes were chosen across all 62 samples. When applied on the 384 × 17 yeast data matrix, the algorithm produced 450 biclusters, and when applied on the 500 × 36 colon data matrix, it produced 322 biclusters.

Figures 5 and 6 show cluster profile plots of the expression levels of the genes in biclusters generated by ROBICOR from the yeast and colon expression data. Only four randomly selected biclusters are plotted to illustrate the accuracy of the algorithm. The genes in each bicluster are highly correlated, as their profile patterns vary similarly. Genes in the lower approximation and in the boundary region are distinguished by color: the profiles of genes in the boundary region are drawn in red and those of genes in the lower approximation in blue. Table II presents the placement of a few randomly selected genes in the lower and boundary regions of three different biclusters, based on their mean squared residue values. Table III lists, for the resulting biclusters, the number of genes and conditions and the number of genes in the lower approximation and the boundary region. The algorithm thus differentiates the members that certainly belong to a bicluster from the members that possibly belong to it.

Figure 5.

Cluster profile plot of bicluster1 and bicluster2 produced by ROBICOR when applied on colon gene expression

Figure 6.

Cluster profile plot of bicluster1 and bicluster2 produced by ROBICOR when applied on yeast gene expression

Table II.

Membership of genes in yeast expression data. The ratio between the mean squared residue of the bicluster with minimum mean squared residue and other mean squared residue value determines the membership of a gene

| Gene ID | Mean squared residue, Cluster 24 | Mean squared residue, Cluster 55 | Mean squared residue, Cluster 86 | Membership of the gene |
|---|---|---|---|---|
| G89 | 240.26 | 121.12 | 124.03 | Placed in the boundary region of Cluster 24, Cluster 55, and Cluster 86 |
| G107 | 130.36 | 290.16 | 321.00 | Placed in the lower approximation of Cluster 24 |
| G260 | 319.93 | 155.37 | 171.83 | Placed in the boundary region of Cluster 55 and Cluster 86 |
| G301 | 146.39 | 149.63 | 394.26 | Placed in the boundary region of Cluster 24 and Cluster 55 |

Table III.

Number of genes in the lower and boundary region of clusters produced by ROBICOR when applied on yeast expression data

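| | C4 | C15 | C26 | C48 | C75 |
|---|---|---|---|---|---|
| Total number of genes | 17 | 20 | 8 | 5 | 15 |
| Total number of conditions/attributes | 8 | 15 | 6 | 10 | 4 |
| Number of genes in lower approximation | 3 | 7 | 5 | 3 | 6 |
| Number of genes in boundary region | 14 | 13 | 3 | 2 | 11 |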

Performance metrics

To evaluate the performance of the biclustering algorithm ROBICOR, the Adjusted Rand index (ARI), the Silhouette index (SI), and the Davies–Bouldin (DB) index are used. ARI is applied on a 50 × 10 synthetic data set, while SI and the DB index are applied on both artificial and real datasets. The average ARI and SI values over 10 runs of each algorithm are reported in Table IV. The value N in the table indicates the number of clusters. For the synthetic dataset, ROBICOR achieves higher ARI and SI values than the other clustering and biclustering algorithms. The SI and DB index of ROBICOR for the three real datasets are compared with those of the other algorithms in Table V. The SI values of ROBICOR are closer to 1 than those of the other algorithms on all three real data sets, and its DB index is the lowest among the compared algorithms.
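For reference, the Adjusted Rand index can be computed from two label vectors as sketched below. The sketch assumes hard labels, so overlapping biclusters would first have to be reduced to a single label per object; how that reduction was done in the experiments is not stated, and the example labels are hypothetical.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two hard partitions of the same objects."""
    n = len(labels_true)
    # Contingency counts n_ij, and the marginal cluster sizes a_i and b_j.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_true * sum_pred / comb(n, 2)
    max_index = 0.5 * (sum_true + sum_pred)
    if max_index == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (sum_cells - expected) / (max_index - expected)

# Hypothetical labelings of six objects.
truth = [0, 0, 0, 1, 1, 1]
relabelled = [1, 1, 1, 0, 0, 0]      # same grouping, different label names
ari_same = adjusted_rand_index(truth, relabelled)

crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])  # groupings that disagree
```

The index is 1 for identical groupings regardless of label names, and it can fall below 0 when two partitions agree less than expected by chance.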

Table IV.

Comparison of ROBICOR with other algorithms in terms of ARI and SI for synthetic dataset

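| Algorithm | N | ARI | SI |
|---|---|---|---|
| ROBICOR | 22 | 0.6455 | 0.5799 |
| BCCA | 18 | 0.5548 | 0.5022 |
| ROB | 10 | 0.5466 | 0.4099 |
| CC | 10 | 0.5126 | 0.3716 |
| SCAD | 10 | 0.4713 | 0.3111 |
| Rough k-means | 10 | 0.5492 | 0.3875 |

Table V.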

Performance comparison of ROBICOR with other algorithms in terms of SI and DB index

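| Dataset | Clustering algorithm | Number of clusters | Silhouette index | DB index |
|---|---|---|---|---|
| Yeast | ROBICOR | 350 | 0.6649 | 1.7462 |
| Yeast | BCCA | 326 | 0.5612 | 1.8333 |
| Yeast | ROB | 50 | 0.5044 | 2.0666 |
| Yeast | CC | 50 | 0.4151 | 2.1264 |
| Yeast | SCAD | 50 | 0.5136 | 2.0122 |
| Yeast | Rough k-means | 50 | 0.5632 | 1.9121 |
| Colon cancer | ROBICOR | 239 | 0.5497 | 1.8847 |
| Colon cancer | BCCA | 209 | 0.5029 | 1.8997 |
| Colon cancer | ROB | 50 | 0.4222 | 2.2744 |
| Colon cancer | CC | 50 | 0.3766 | 2.8654 |
| Colon cancer | SCAD | 50 | 0.4245 | 2.3148 |
| Colon cancer | Rough k-means | 50 | 0.4144 | 1.9788 |
| Leukemia | ROBICOR | 308 | 0.5842 | 1.7113 |
| Leukemia | BCCA | 287 | 0.5766 | 1.8411 |
| Leukemia | ROB | 50 | 0.5111 | 2.1688 |
| Leukemia | CC | 50 | 0.4933 | 2.7613 |
| Leukemia | SCAD | 50 | 0.4288 | 2.4254 |
| Leukemia | Rough k-means | 50 | 0.4155 | 1.9142 |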

Test for statistical significance

The p-values produced by Wilcoxon’s rank sum test are calculated for the algorithms participating in the comparison. The ARI scores for the artificial data and the SI scores for the real data sets are recorded for 10 consecutive runs of the algorithms. The null hypothesis is that the median values of the two groups do not differ significantly; the alternative hypothesis is that they do. Table VI reports the p-values produced by Wilcoxon’s rank sum test, comparing two groups at a time. All the p-values in the table are obtained by comparing ROBICOR with another algorithm (ROBICOR as group one and the other algorithm as group two). All of them are less than 0.05 (5% significance level). The median values of ROBICOR are also better than those of all the other algorithms. Such small p-values (less than 0.05) are strong evidence against the null hypothesis.
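A rank sum comparison of two sets of scores can be sketched as below. This is a plain-Python illustration using the normal approximation without a tie correction of the variance, which is an assumption; the paper does not state which variant or software was used, and the scores shown are made-up values, not the experimental results.

```python
import math

def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank sum test via the normal approximation.

    Returns (z, p). Tied values receive average ranks; the variance is
    not corrected for ties.
    """
    pooled = sorted([(x, 0) for x in a] + [(x, 1) for x in b])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n1, n2 = len(a), len(b)
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Hypothetical index scores from 10 runs of two algorithms (illustrative only).
group_one = [0.66, 0.67, 0.65, 0.68, 0.66, 0.69, 0.64, 0.67, 0.65, 0.66]
group_two = [0.56, 0.55, 0.57, 0.54, 0.58, 0.56, 0.55, 0.53, 0.57, 0.56]
z, p = rank_sum_test(group_one, group_two)
significant = p < 0.05
```

When every score in one group exceeds every score in the other, as in this toy example, the statistic is as extreme as ten runs per group allow, so the p-value falls well below the 5% level.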

Table VI.

p-values of comparing ROBICOR with other algorithms

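| Dataset | p-value vs. BCCA | p-value vs. ROB | p-value vs. CC | p-value vs. SCAD |
|---|---|---|---|---|
| Artificial dataset | 4.7 × 10^−4 | 4.6 × 10^−4 | 4.2 × 10^−5 | 5.2 × 10^−5 |
| Yeast | 1.7 × 10^−4 | 4.3 × 10^−4 | 2.8 × 10^−4 | 4.5 × 10^−4 |
| Colon cancer | 2.4 × 10^−3 | 3.7 × 10^−5 | 1.7 × 10^−3 | 2.6 × 10^−5 |
| Leukemia | 3.7 × 10^−4 | 5.2 × 10^−5 | 3.5 × 10^−5 | 3.4 × 10^−4 |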

Conclusion

We have proposed and developed a biclustering algorithm called ROBICOR that uses the Pearson correlation coefficient as a similarity measure together with the mean squared residue. The algorithm finds groups of genes that show similar patterns in their expression profiles over a subset of conditions. The results indicate that the genes in a bicluster obtained by ROBICOR are highly correlated and that the biclusters are highly coherent. The method also finds a set of biclusters with a reasonable degree of overlapping, associating each bicluster with a lower and an upper approximation. In the reported experiments, ROBICOR achieved higher ARI and SI values and a lower DB index than the algorithms it was compared with (Tables IV and V).

Conflict of Interest

The authors declare no conflict of interest.

Appendix

The ROBICOR algorithm is implemented in C. To invoke the C code from a Java interface, the Java Native Interface (JNI) is used. JNI is a mechanism that allows a Java program to call a function in a C or C++ program. For all the C functions, shared library files are created with the .dll extension or the .so extension (Linux). The native method is declared in Java, and the shared library files are loaded before the native method is called. A C header file containing function prototypes for the native methods is also created. To improve the efficiency of the C programs, the input text file is converted into a binary file, and every access to the input is made on the binary file.
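The text-to-binary conversion can be illustrated with a short sketch. It is written in Python rather than C, and the file layout (two 32-bit integers for the matrix shape followed by row-major 64-bit floats) is an assumption made for illustration; the layout used in the actual implementation is not described.

```python
import os
import struct
import tempfile

HEADER = struct.Struct("<ii")   # assumed header: number of rows, number of columns
VALUE = struct.Struct("<d")     # assumed cell format: little-endian 64-bit float

def text_to_binary(text_path, bin_path):
    """Convert a whitespace-separated expression matrix into the assumed binary layout."""
    with open(text_path) as src:
        rows = [[float(tok) for tok in line.split()] for line in src if line.strip()]
    n_rows, n_cols = len(rows), len(rows[0])
    with open(bin_path, "wb") as dst:
        dst.write(HEADER.pack(n_rows, n_cols))
        for row in rows:
            dst.write(struct.pack(f"<{n_cols}d", *row))
    return n_rows, n_cols

def read_value(bin_path, i, j):
    """Read a single cell by seeking directly to its offset, without parsing text."""
    with open(bin_path, "rb") as f:
        _, n_cols = HEADER.unpack(f.read(HEADER.size))
        f.seek(HEADER.size + (i * n_cols + j) * VALUE.size)
        return VALUE.unpack(f.read(VALUE.size))[0]

# Round trip on a tiny hypothetical matrix stored in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    text_path = os.path.join(tmp, "expr.txt")
    bin_path = os.path.join(tmp, "expr.bin")
    with open(text_path, "w") as f:
        f.write("1.5 2.25 3.0\n4.0 5.5 6.75\n")
    shape = text_to_binary(text_path, bin_path)
    value = read_value(bin_path, 1, 2)
```

A fixed-width binary layout lets any cell be reached with a single seek, which is the kind of saving that repeated passes over the expression matrix can benefit from.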

References

  • 1.

    Wang, H., Wang, Z., Li, X., Gong, B., Feng, L., Zhou, Y.: A robust approach based on Weibull distribution for clustering gene expression data. Algorithms Mol Bio 6, 614 (2011).

    • Search Google Scholar
    • Export Citation
  • 2.

    Stekel, D: Microarray Bioinform. Cambridge University Press, Cambridge, UK, 2006.

  • 3.

    Jiang, D., Tang, C., Zhang, A.: Cluster analysis for gene expression data: A survey. IEEE Trans Knowledge Data Eng 16, 13701386 (2004).

  • 4.

    Madeira, S. C., Oliveira, A. L.: Biclustering algorithms for biological data analysis: A survey. IEEE/ACM Trans Comput Biol Bioinform 1, 2445 (2004).

    • Crossref
    • Search Google Scholar
    • Export Citation
  • 5.

    Yang, E., Foteinou, P. T., King, K. R., Yarmush, M. L., Androulakis, I. P.: A novel non-overlapping bi-clustering algorithm for network generation using living cell array data. Bioinformatics 23, 2306 (2007).

    • Crossref
    • Search Google Scholar
    • Export Citation
  • 6.

    Pensa, R. G., Boulicaut, J.-F.: Constrained co-clustering of gene expression data. In Proceedings of the 2008 SIAM International Conference on Data Mining, 2008.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • 7.

    Tsai, C.-Y., Chiu, C.-C.: A novel microarray biclustering algorithm. Int J Math Comput Phys Elect Comput Eng 4, 256 (2010).

  • 8.

    Prelic, A., Bleuler, S., Zimmermann, P., Wille, A., Bühlmann, P., Gruissem, W., Hennig, L., Thiele, L., Zitzler, E.: A systematic comparison and evaluation of biclustering methods for gene expression data. Bioinformatics 22, 11221129 (2006).

    • Crossref
    • Search Google Scholar
    • Export Citation
  • 9.

    Frigui, H., Nasraoui, O.: Unsupervised learning of prototypes and attribute weights. Pattern Recogn 37, 567581 (2004).

  • 10.

    Emilyn, J. J., Ramar, K.: An Intelligent mining framework based on rough sets for clustering gene expression data. J Appl Sci 12, 19321938 (2012).

    • Crossref
    • Search Google Scholar
    • Export Citation
  • 11.

    Shi, P.: Clustering fuzzy web transactions with rough k-means. In AST 09 Proceedings of the 2009 International e-Conference on Advancd Science and Technology, IEEE Computer Society, Washington, DC, 2009, pp. 4851.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • 12.

    Wang, R., Miao, D., Li, G., Zhang, H.: Rough overlapping biclustering of gene expression data. In Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, Harvard Medical School, Boston, MA, 2007, pp. 828834.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • 13.

    Lingras, P., West, C., Interval set clustering of web users with rough k-means. J Intel Inf Syst 23, 516 (2004).

  • 14.

    Peters, G.: Some refinements of rough k-means clustering. Pattern Recogn 39, 14811491 (2006).

  • 15.

    Lingras, P., Yan, P. R., Hogo, M.: Rough set based clustering: Evolutionary, neural, and statistical approaches. In Proceedings of the First Indian International Conference on Artificial Intelligence, IICAI, Hyderabad, India, 2003, pp. 10741087.

    • Search Google Scholar
    • Export Citation
  • 16.

    Tang, C., Zhang, L., Zhang, A., Ranmanathan, M.: Interrelated two-way clustering: An unsupervised approach for gene expression data analysis. In Proceedings of the 2nd IEEE International Symposium on Bioinformatics and Bioengineering, 2001, pp. 4148.

    • Crossref
    • Search Google Scholar
    • Export Citation
  • 17.

    Bhattacharya, A., De, R. K.: Bi-correlation clustering algorithm for determining a set of co-regulated genes. Bioinformatics 25, 27952280 (2009).

    • Crossref
    • Search Google Scholar
    • Export Citation
  • 18.

    Cheng, Y., Church, G. M.: Biclustering of expression data. Proc Int Conf Intell Syst Mol Biol 8, 93–103 (2000).

  • 19.

    Pawlak, Z.: Rough sets. Int J Comput Inform Sci 11, 341–356 (1982).

  • 20.

    Yang, J., Wang, H., Wang, W., Yu, P.: Enhanced biclustering on expression data. In Proceedings of the 3rd IEEE Symposium on BioInformatics and BioEngineering, BIBE, Bethesda, MD, 2003.


 


Acta Microbiologica et Immunologica Hungarica
Language: English
Size: A4
Year of Foundation: 1954
Publication Programme: 2021, Volume 68
Volumes per Year: 1
Issues per Year: 4
Founder: Magyar Tudományos Akadémia
Founder's Address: H-1051 Budapest, Hungary, Széchenyi István tér 9.
Publisher: Akadémiai Kiadó
Publisher's Address: H-1117 Budapest, Hungary 1516 Budapest, PO Box 245.
Responsible Publisher: Chief Executive Officer, Akadémiai Kiadó
ISSN: 1217-8950 (Print)
ISSN: 1588-2640 (Online)
