Remove `parse_sequence_from_file` #110

davidbbeyer · 2025-01-20T15:09:27Z

Removes parse_sequence_from_file.

This function is no longer needed to parse sequence when pmb.df have been read from file since the parsing is done when using pmb.df #30

pm-blanco · 2025-01-21T09:08:54Z

@davidbbeyer thank you for taking care of this. Revisiting this method, I see that it was introduced as a temporary fix in calculate_HH because sequence was not a list when retrieved from pyMBE dataframes that had been read from file. However, we fixed this issue in #30 and now sequence should always be a list regardless if the pyMBE dataframe was read or not from file.

Since right now this function is simply adding non-functional lines of code to the library I suggest to deprecate it instead of adding coverage a dedicated CI test for it. What do you think @davidbbeyer?

I tested that calculate_HH works without needing parse_sequence_from_file by doing the following modifications:

in pyMBE.py

def calculate_HH(self, molecule_name, pH_list=None, pka_set=None):
        if pH_list is None:
            pH_list=np.linspace(2,12,50)
        if pka_set is None:
            pka_set=self.get_pka_set()
        self.check_pka_set(pka_set=pka_set)
        charge_number_map = self.get_charge_number_map()
        Z_HH=[]
        for pH_value in pH_list:
            Z=0
            index = self.df.loc[self.df['name'] == molecule_name].index[0].item()
            residue_list = self.df.at [index,('residue_list','')]
            sequence = self.df.at [index,('sequence','')]
            if np.any(pd.isnull(sequence)):
                # Molecule has no sequence
                for residue in residue_list:
                    list_of_particles_in_residue = self.search_particles_in_residue(residue)
                    for particle in list_of_particles_in_residue:
                        if particle in pka_set.keys():
                            if pka_set[particle]['acidity'] == 'acidic':
                                psi=-1
                            elif pka_set[particle]['acidity']== 'basic':
                                psi=+1
                            else:
                                psi=0
                            Z+=psi/(1+10**(psi*(pH_value-pka_set[particle]['pka_value'])))
                Z_HH.append(Z)
            else:
                """
                This lines are no longer needed
                # Molecule has a sequence
                if not isinstance(sequence, list):
                    # If the df has been read by file, the sequence needs to be parsed.
                    sequence = self.parse_sequence_from_file(sequence=sequence)
                """
                for name in sequence:
                    if name in pka_set.keys():
                        if pka_set[name]['acidity'] == 'acidic':
                            psi=-1
                        elif pka_set[name]['acidity']== 'basic':
                            psi=+1
                        else:
                            psi=0
                        Z+=psi/(1+10**(psi*(pH_value-pka_set[name]['pka_value'])))
                    else:
                        state_one_type = self.df.loc[self.df['name']==name].state_one.es_type.values[0]
                        Z+=charge_number_map[state_one_type]
                Z_HH.append(Z)

        return Z_HH

In samples/plot_branched_polyampholyte.py

pmb.read_pmb_df("df.csv")
""" Read the definitions from file instead than defining them on the same script
# Acidic particle
pmb.define_particle(
    name = "A",
    acidity = "acidic",
    pka = 4,
    sigma = 1*pmb.units('reduced_length'),
    epsilon = 1*pmb.units('reduced_energy'))
    
# Basic particle
pmb.define_particle(
    name = "B",
    acidity = "basic",
    pka = 9,
    sigma = 1*pmb.units('reduced_length'),
    epsilon = 1*pmb.units('reduced_energy'))

# Define different residues
pmb.define_residue(
    name = "Res_1",
    central_bead = "I",
    side_chains = ["A","B"])
    
pmb.define_residue(
    name = "Res_2",
    central_bead = "I",
    side_chains = ["Res_1"])

# Define the molecule
pmb.define_molecule(
    name = "polyampholyte",
    residue_list = 5*["Res_1"] + 5*["Res_2"])
"""

After these modification, the pyMBE pipeline for the polyampholyte sample case still runs fine:

python3 branched_polyampholyte.py 
python3 analyze_time_series.py --data_folder time_series/branched_polyampholyte 
python3 plot_branched_polyampholyte.py

davidbbeyer · 2025-01-21T12:43:05Z

@pm-blanco Even better, then we can just remove it.

pm-blanco · 2025-01-21T13:06:37Z

Agreed. Can you please do it @davidbbeyer ?

davidbbeyer · 2025-01-21T13:25:24Z

Done.

pm-blanco

Looks good to me!

Add parsing test

8646467

pm-blanco self-requested a review January 21, 2025 08:38

pm-blanco assigned davidbbeyer Jan 21, 2025

pm-blanco added the ci-improvement label Jan 21, 2025

pm-blanco added this to the pyMBE 1.0.0 milestone Jan 21, 2025

Remove parse_sequence_from_file

c223e52

update CHANGELOG

3a15161

pm-blanco added code quality and removed ci-improvement labels Jan 21, 2025

pm-blanco changed the title ~~Add parsing test~~ Remove parse_sequence_from_file Jan 21, 2025

pm-blanco approved these changes Jan 21, 2025

View reviewed changes

pm-blanco merged commit 6149dc1 into pyMBE-dev:main Jan 21, 2025
1 check passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remove `parse_sequence_from_file` #110

Remove `parse_sequence_from_file` #110

davidbbeyer commented Jan 20, 2025 •

edited by pm-blanco

Loading

pm-blanco commented Jan 21, 2025 •

edited

Loading

davidbbeyer commented Jan 21, 2025

pm-blanco commented Jan 21, 2025

davidbbeyer commented Jan 21, 2025

pm-blanco left a comment

Remove parse_sequence_from_file #110

Remove parse_sequence_from_file #110

Conversation

davidbbeyer commented Jan 20, 2025 • edited by pm-blanco Loading

pm-blanco commented Jan 21, 2025 • edited Loading

davidbbeyer commented Jan 21, 2025

pm-blanco commented Jan 21, 2025

davidbbeyer commented Jan 21, 2025

pm-blanco left a comment

Choose a reason for hiding this comment

Remove `parse_sequence_from_file` #110

Remove `parse_sequence_from_file` #110

davidbbeyer commented Jan 20, 2025 •

edited by pm-blanco

Loading

pm-blanco commented Jan 21, 2025 •

edited

Loading